WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification:

C12N 15/12, C12Q 1/68, C07K 14/465,
14/47, A01K 67/027, C07K 16/18

A1



(11) International Publication Number: WO 96/39505

(43) International Publication Date: 12 December 1996 (12.12.96) 



(21) International Application Number: PCT/GB96/01341 

(22) International Filing Date: 5 June 1996 (05.06.96) 



(30) Priority Data:

9511439.3    6 June 1995 (06.06.95)    GB



(71) Applicant (for all designated States except US): ISIS
INNOVATION LIMITED [GB/GB]; 2 South Parks Road, Oxford
OX1 3UB (GB).

(72) Inventors; and

(75) Inventors/Applicants (for US only): GRIFFITHS, Richard
[GB/GB]; 4 Upway Road, Oxford OX3 9QH (GB). TIWARI,
Bela [CA/GB]; 4 Upway Road, Oxford OX3 9QH (GB).

(74) Agent: PENNANT, Pyers; Stevens Hewlett & Perkins, 1
Serjeants' Inn, Fleet Street, London EC4Y 1LL (GB).



(81) Designated States: AU, IL, JP, NZ, US, European patent (AT, 
BE, CH, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC,
NL, PT, SE). 



Published 

With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: AVIAN CHD GENES AND THEIR USE IN METHODS FOR SEX IDENTIFICATION IN BIRDS



(57) Abstract 



The invention provides DNA sequences (introns and exons) encoding two genes located on the Z and W chromosomes of birds.
These genes can be used to identify the sex of most birds (Class Aves). In addition, the genes control the sex of the birds, a genetic
process which is initiated in the embryo. Nucleic acid fragments, proteins, polypeptides, antibodies and related products and their use in
medicine and agriculture are provided. The invention may be used in sex identification or for controlling the sex of adults or the progeny
of commercially important animals.



FOR THE PURPOSES OF INFORMATION ONLY 



Codes used to identify States party to the PCT on the front pages of pamphlets publishing international 
applications under the PCT. 



AM  Armenia                     GB  United Kingdom               MW  Malawi
AT  Austria                     GE  Georgia                      MX  Mexico
AU  Australia                   GN  Guinea                       NE  Niger
BB  Barbados                    GR  Greece                       NL  Netherlands
BE  Belgium                     HU  Hungary                      NO  Norway
BF  Burkina Faso                IE  Ireland                      NZ  New Zealand
BG  Bulgaria                    IT  Italy                        PL  Poland
BJ  Benin                       JP  Japan                        PT  Portugal
BR  Brazil                      KE  Kenya                        RO  Romania
BY  Belarus                     KG  Kyrgyzstan                   RU  Russian Federation
CA  Canada                      KP  Democratic People's          SD  Sudan
CF  Central African Republic        Republic of Korea            SE  Sweden
CG  Congo                       KR  Republic of Korea            SG  Singapore
CH  Switzerland                 KZ  Kazakhstan                   SI  Slovenia
CI  Côte d'Ivoire               LI  Liechtenstein                SK  Slovakia
CM  Cameroon                    LK  Sri Lanka                    SN  Senegal
CN  China                       LR  Liberia                      SZ  Swaziland
CS  Czechoslovakia              LT  Lithuania                    TD  Chad
CZ  Czech Republic              LU  Luxembourg                   TG  Togo
DE  Germany                     LV  Latvia                       TJ  Tajikistan
DK  Denmark                     MC  Monaco                       TT  Trinidad and Tobago
EE  Estonia                     MD  Republic of Moldova          UA  Ukraine
ES  Spain                       MG  Madagascar                   UG  Uganda
FI  Finland                     ML  Mali                         US  United States of America
FR  France                      MN  Mongolia                     UZ  Uzbekistan
GA  Gabon                       MR  Mauritania                   VN  Viet Nam


WO 96/39505



- 1 - 



PCT/GB96/01341 



Avian CHD genes and their use in methods for sex identification in birds

Introduction

The present invention relates to proteins, polypeptides, 
nucleic acid fragments, antibodies and related products and to their use in 
medicine and agriculture, for instance in diagnosis and therapy. More 
particularly the invention relates to a gene or genes which can be used to
ascertain the sex of avian adults, embryos, cells, and tissues. These
genes also control the sex of birds starting with action in the embryos and
so control the sex of the progeny of birds.

Much of our understanding of sex determination comes from 
three extensively studied model systems. In two of these, the fruitfly
Drosophila melanogaster and the nematode Caenorhabditis elegans, it is
the ratio of X chromosomes to autosomes that initiates sexual
differentiation (Hodgkin 1992). In the mouse a single gene, SRY, located
on the Y chromosome provides the impetus for male development, a
pattern that is thought to be conserved throughout the mammals
(Koopman et al. 1991, Foster et al. 1992).

At the genetic level these three species employ very
different molecular mechanisms, not only to control sex determination itself
but to accommodate the differing dosages of genes that result from the
males possessing a single X and the females two X chromosomes. These
basic differences are largely due to the independent evolution of the three
mechanisms and strongly suggest that other means of sex determination
will have evolved elsewhere in the animal kingdom.

One class in which little is known about sex determination is
the birds. They exhibit female heterogamety, which means that the female
has Z and W sex chromosomes and the male ZZ. This immediately
suggests that sex determination in this class has an independent origin to
that of their sister class, the mammals, where it is the male that is
heterogametic. Furthermore, it has been shown that whilst female
mammals inactivate one of their X chromosomes as a method of dosage
compensation (Grant & Chapman 1988), this does not seem to be a device
employed by birds (Baverstock et al. 1982).

However, similarities do exist between the birds and
mammals. The W chromosome, like the Y chromosome, is usually smaller
than its partner, and is also characteristically heterochromatic in
appearance (Christidis 1990). The main exceptions to this rule are found
in the 'primitive' representatives of both classes: the monotremes and the
ratites, where the morphological differences between the sex chromosomes
are poorly defined (Graves 1987, Tagaki et al. 1972).

The heterochromatization of the W and Y results from the
replacement of functional genetic loci with 'junk DNA' sequences. This
process is thought to be a consequence of a suppression of recombination
that has arisen to ensure that genes vital to the development of the
heterogametic sex remain linked on the Y or W chromosome
(Charlesworth 1991). As a result only a few genes such as Ube1y (Kay et
al. 1991, Mitchell et al. 1991), ZFY (Page et al. 1987) and SRY itself remain
on the mammalian Y chromosome. A similar situation is thought to prevail
on the avian W chromosome where the presence of any functional genes
has yet to be demonstrated, although it does possess vast arrays of
repetitive elements (Griffiths & Holland 1990, Tone et al. 1982).

A further similarity in sex determination in birds and mammals
is that the development of the male phenotype appears crucially
dependent on the appearance of the testis. The female phenotype is the
result of the 'default pathway'. For mammals this was first demonstrated
by Jost (1947), who grafted an embryonic testis into genetically female
rabbit embryos prior to sex determination. This was sufficient to allow the
development of functional males. The same experiment has been carried
out on chick embryos with comparable results (Stoll et al. 1978).

Once the testis has formed, the process of masculinization is
taken over by the testicular hormones. The genetic switch that initiates
testis determination is known to be SRY in mammals (Koopman et al.
1991). In birds, there appears to be no SRY homologue on the W
chromosome (Griffiths 1991), although this is unsurprising given the
separate evolution of sex determination in the two classes.

The only other pertinent evidence on the genetics of avian
sex determination comes from reports of chickens with abnormal
chromosome complements. Table 1 shows data from Crew (1954) and
McCarrey and Abbott (1979) on the phenotypes of the aneuploids so far
described. These results suggest that the presence of the W chromosome
in the aneuploid AA ZZW and the polyploid AAA ZZW has not acted as a
dominant determinant of the female phenotype. This may mean that sex in
birds is determined more by the autosome to Z ratio, as in Drosophila
and C. elegans. However, a ZO aneuploid which could confirm this
hypothesis has yet to be described.

It must also be borne in mind that XXY kangaroos, where SRY
is thought to be the key male determining switch, exhibit both male and
female characteristics (Graves 1987). This suggests that the limited
aneuploid data that are available for birds should be interpreted with some
caution.

To conclude, the genetic mechanism that controls sex
determination in birds has not yet been elucidated. Here we suggest that a
gene we have termed CHD-W (Chromodomain-Helicase-DNA binding on
the W chromosome), alone or acting in conjunction with a closely related
gene CHD-1A (Chromodomain-Helicase-DNA binding 1 Avian), initiates
female development in birds.

The Invention

It is believed that all birds, such as chickens and other species
of commercial significance, will have two or more genes of the CHD type
which will have a nucleotide sequence similar to the nucleotide sequences
shown in Fig. 5, Fig. 7 and Fig. 8 and that the gene products will be
proteins which are crucial to the determination of the sex of the organism.
One of these genes will be located on the W chromosome and the other on
an autosome or Z chromosome.

It will be understood that the exact sequence of the two
genes will vary between species and between individuals of the same
species at least at the nucleotide level and often also at the protein level.
Complete or partial sequences of the chicken genes are shown in Fig. 5,
Fig. 7 and Fig. 8. The gene or protein which contains sequence
corresponding to those in Fig. 5, Fig. 7 and Fig. 8 will hereafter be referred
to as a CHD-gene and proteins and fragments thereof, polypeptides,
nucleic acids and fragments thereof and oligonucleotides containing part of
a CHD-gene will hereafter be referred to as CHD-proteins, CHD-nucleic
acids and so on.

The present invention therefore provides a CHD-protein or a 
fragment thereof or polypeptide comprising a CHD-gene or a part thereof, 
subject to the proviso below. 

The present invention also provides a protein or a fragment
thereof or a polypeptide containing a mimetope of an epitope of a
CHD-protein or fragment thereof or polypeptide containing a CHD-gene or a part
thereof, subject to the proviso below. Such proteins, fragments and
polypeptides are hereafter referred to as CHD-mimetope proteins or
fragments thereof and CHD-mimetope polypeptides.

The present invention also provides a CHD-nucleic acid or a
fragment thereof or oligonucleotide comprising a CHD-gene, or a part
thereof, subject to the proviso below.

In a particular aspect the present invention provides a single
or double stranded nucleic acid comprising the CHD-gene of a bird or a
part thereof of at least 17 contiguous nucleotide bases or base pairs, or a
single or double stranded nucleic acid hybridizable with the CHD-gene of a
bird, or part thereof of at least 17 contiguous nucleotide bases or base
pairs, subject to the proviso below.
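The 17-base minimum in the paragraph above can be illustrated with a short sketch. It is an illustration only, not a method taken from this specification: the function names and the test sequences are hypothetical, and it checks exact identity along one strand rather than hybridization.

```python
MIN_CONTIGUOUS = 17  # minimum contiguous length stated in the text above


def longest_shared_run(a: str, b: str) -> int:
    """Return the length of the longest contiguous stretch found in both sequences."""
    a, b = a.upper(), b.upper()
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous base of both sequences.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def meets_minimum_length(candidate: str, reference: str,
                         min_len: int = MIN_CONTIGUOUS) -> bool:
    """True if candidate shares at least min_len contiguous bases with reference."""
    return longest_shared_run(candidate, reference) >= min_len


if __name__ == "__main__":
    # Placeholder sequence for illustration; it is not a CHD sequence.
    reference = "ACGTACGTTAGCCGATAGGCTTAACGGATC"
    print(meets_minimum_length(reference[5:22], reference))  # 17-base fragment
    print(meets_minimum_length(reference[5:21], reference))  # 16-base fragment
```

A fuller check would also compare the candidate against the complementary strand, since the text covers both single and double stranded nucleic acids.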

The invention further provides a nucleic acid or fragment
thereof or an oligonucleotide encoding a CHD-protein or fragment thereof
or a polypeptide comprising a CHD-gene or a part thereof or a
CHD-mimetope protein or a fragment thereof or CHD-mimetope polypeptide,
subject to the following proviso. These nucleic acids, fragments and
oligonucleotides may have sequences differing from the sequences of
CHD-nucleic acids, fragments and oligonucleotides due to alternative
codon usage and/or encoding alternative amino acid sequences or
mimetopes.

The present invention does not, however, extend to any
known protein or fragment thereof or polypeptide or nucleic acid or
fragment thereof or oligonucleotide containing a CHD-gene related
sequence such as the Saccharomyces cerevisiae SNF2/SWI2 gene,
Drosophila polycomb and HP1 genes described below, insofar as that
protein or fragment, polypeptide, nucleic acid or fragment or
oligonucleotide is known per se.

The amino acid sequence of the CHD-gene has similarities to
the chromobox and Helicase motifs of a number of discovered genes
known to be involved in the remodelling of chromatin. This suggests that
the CHD-protein of the present invention may have a regulatory function
involving chromatin remodelling. However, none of these genes contain
the chromobox and the Helicase of the CHD-gene which are conserved in
conjunction, at least in the chicken, great tit, mouse and yeast but are not
conserved in conjunction in the sequences of chromatin remodelling
proteins not associated with sex determination at least at the stage of testis
formation in birds. A gene that produces a protein having chromatin
remodelling capacity but lacking these characteristic motifs is therefore
outside the scope of the present invention.

In addition there are certain residues in the amino acid 
sequence of the chromobox and those residues immediately downstream 
thereof, of the CHD-gene which are also conserved at least between those 
found in the chicken, great tit, mouse and yeast but are not conserved in 
the sequences of chromatin remodelling proteins not associated with sex 
determination at least at the stage of testis formation in birds. Any one of 
these conserved residues is therefore considered characteristic of the 
CHD-gene proteins of the present invention. The characteristics of a 
CHD-chromobox will give a more complete and comprehensive description
of the CHD-chromobox which can also be considered characteristic of the 
CHD-gene proteins of the present invention. A protein having chromatin 
remodelling capacity and a helicase motif but originating from a gene that 
lacks all or most of these characteristic amino acid residues in the 
chromobox motif is therefore outside the scope of the present invention. 

The characteristic amino acid residues are shown in the
alignment in Fig. 11, which is described in more detail below. When
aligned with the illustrated sequences as shown, these residues fall at
positions 11, 12, 20, 27, 34 inside the chromobox and 3, 6, 8, 12-15, 16
immediately downstream.

The nucleotide base sequence of the CHD-gene includes
bases which encode the chromobox and Helicase motifs of chromatin
remodelling proteins as described above. However, the base sequence of
the CHD-nucleic acids of the gene will include codons specifying both or
either chromobox and Helicase motifs and the former will have codons
specifying one or more of the characteristic amino acid residues described
above and/or will be hybridizable with a sequence that controls the sex
determination of birds under conditions which substantially prevent
hybridization to other sequences in birds that do not have these
characteristics.

Preferably the CHD-nucleic acids of the invention encode a
chromobox and a helicase and one or more, preferably all, of the
characteristic chromobox amino acid residues and meet the above
hybridization requirements.

Fragments of CHD-nucleic acids according to the present
invention will likewise contain codons specifying the chromobox and
helicase motifs or including at least part of either of these motifs or
CHD-gene adjacent to the codons encoding these features and/or will be
hybridizable with a sequence that controls the sex determination of birds
under conditions which substantially prevent hybridization to other
sequences in birds that do not have these characteristics.

Oligonucleotides containing the CHD-gene or a part thereof
according to the present invention may contain codons specifying the
chromobox or helicase motifs or including at least part of these motifs or
CHD-gene but this is not essential. However, all such oligonucleotides of
the invention must be capable of hybridizing with a sequence or sequences
that control the sex determination of birds or a gene intron, preferably
under conditions which substantially prevent hybridization with any
sequence not associated with a sex determining sequence.

A sex determining sequence referred to herein is a sequence
which contains the CHD-gene and which encodes a factor which when
expressed at the appropriate stage and level during embryo development
may result in testis formation and subsequent growth of the embryo as a
male. It may alternatively refer to a sequence which encodes a factor
which when expressed at the appropriate stage and level during embryo
development prevents testis formation and results in the subsequent
growth of the embryo as a female.

The hybridization conditions referred to above which prevent
unwanted hybridization with sequences not associated with the sex
determining gene will depend to some extent on the length of the nucleic
acid, fragment or oligonucleotide of the invention tested. Thus for instance
lower stringency will be sufficient to secure hybridization to sequences
associated with the sex determining gene whilst preventing unwanted
hybridization when the nucleic acid or fragment is several thousand
nucleotide base pairs in length than for a fragment of only a few hundreds
of bases or an oligonucleotide of from 17 bases up to a few tens or
hundreds of bases. With the smallest oligonucleotides and fragments of
the invention hybridization conditions will be such that only complete
complementarity between the oligonucleotide and/or fragment and the
sequences associated with the sex determining gene will result in
hybridization.

Preferred nucleic acids and fragments of the invention will
only hybridize selectively to the sequences associated with the sex
determining gene or genes under conditions requiring at least 80%, for
instance 85, 90 or even 95%, more preferably 99% complementarity. Yet
more preferred nucleic acids and fragments of the invention are those
having a sequence corresponding exactly to that of those illustrated in Fig.
5, Fig. 7 and Fig. 8 although the nucleotide sequences may be longer or
shorter than those illustrated and/or may contain normally intronic
sequences associated with these sequences.

The invention particularly provides an oligonucleotide, 
polypeptide, nucleic acid or protein comprising the entire sequence of the 
CHD-gene of a bird and more preferably comprising the entire amino acid 
or nucleotide sequence of the chicken as set out in any one of Figs 1, 3, 5,
7, 8, 9, 10, 11.

The nucleic acids hybridizable with the CHD-gene of a bird
are preferably hybridizable under moderate, or more preferably, high
stringency conditions as defined below:
Moderate stringency: 



Buffer: 2 x SSC 

Temp: sqoq 

Annealing period: 6-8hrs 

High stringency: 

Buffer: 1 x SSC 

Temp: 65°C 

Annealing period: 6-8hrs



Moderate stringency as defined above corresponds with 
about 75% homology. High stringency as defined above corresponds with 
about 90% homology. 1 x SSC is 0.15 M sodium chloride, 0.015 M sodium 
citrate, pH 7.0. 
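The buffer strengths and homology figures above lend themselves to a small worked sketch. It scales the stated 1 x SSC composition to other strengths and sorts a pair of aligned sequences by the approximate 75% and 90% homology figures. The function names are hypothetical, percent identity over equal-length aligned sequences is used here as a simple stand-in for "homology", and real hybridization behaviour also depends on temperature, length and base composition, none of which is modelled.

```python
# Composition of 1 x SSC as stated in the text (molar concentrations).
SSC_1X = {"sodium chloride": 0.15, "sodium citrate": 0.015}

# Approximate homology associated with each stringency level in the text.
MODERATE_HOMOLOGY = 75.0
HIGH_HOMOLOGY = 90.0


def ssc_composition(strength: float) -> dict:
    """Molar concentrations of an n x SSC buffer, scaled from 1 x SSC."""
    return {name: round(molar * strength, 4) for name, molar in SSC_1X.items()}


def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def stringency_class(identity: float) -> str:
    """Highest stringency level whose approximate homology figure is met."""
    if identity >= HIGH_HOMOLOGY:
        return "high"
    if identity >= MODERATE_HOMOLOGY:
        return "moderate"
    return "below moderate"


if __name__ == "__main__":
    print(ssc_composition(2))  # the 2 x SSC buffer used at moderate stringency
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(identity, stringency_class(identity))
```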

Preferably the portion of the nucleic acid corresponding to or 
hybridizable with the CHD-gene is at least 20, more preferably at least 30,
40 or 60 and most preferably 100 or more nucleotide bases in length. 

The nucleotide strands of the invention may be single or
double stranded DNA or RNA. DNA's of the invention may comprise
coding and/or non-coding sequences and/or transcriptional and/or
translational start and/or stop signals and/or regulatory, signal and/or
control sequences such as promoters, enhancers and/or polyadenylation
sites, endonuclease restriction sites and/or splice donor and/or acceptor, in
addition to the CHD-gene sequence. Included within the DNA's of the
invention are genomic DNA's and complementary DNA's (cDNA's)
including functional genes or at least an exon containing the CHD-gene.
They may also contain non-coding sequences such as one or more
introns. Single stranded DNA may be the transcribed strand or the
non-transcribed (complementary) strand. The nucleic acids may be present in
a vector, for instance a cloning or expression vector, such as a plasmid or
cosmid or a viral genomic nucleic acid. RNA's of the invention include
unprocessed and processed transcripts of DNA, messenger RNA (mRNA)
containing the CHD-gene and anti-sense RNA containing a sequence
complementary to the CHD-gene.
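The relationship between a strand, its complementary strand and an anti-sense RNA can be sketched as follows. This is a minimal illustration with hypothetical function names and a placeholder sequence, and it assumes unambiguous A, C, G, T input written 5' to 3'.

```python
# Watson-Crick pairing for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complementary DNA strand, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def antisense_rna(coding_strand: str) -> str:
    """RNA complementary to the given coding (mRNA-like) strand, written 5' to 3'."""
    return reverse_complement(coding_strand).replace("T", "U")


if __name__ == "__main__":
    fragment = "ATGC"  # placeholder, not a CHD sequence
    print(reverse_complement(fragment))
    print(antisense_rna(fragment))
```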

Nucleic acids of the present invention are particularly useful 
as primers for polymerase chain reactions (PCRs) conducted to ascertain 
the sex of a bird as defined below. They may also be used to express 
proteins or fragments or polypeptides corresponding to the whole or a part 
of a CHD-protein (whether or not containing a CHD-gene) or as probes in 
hybridization experiments. As used herein the term "fragments" used in 
connection with proteins is intended to refer to both chemically produced 
and recombinant portions of proteins. 

The CHD-proteins and fragments thereof and polypeptides 
containing the CHD-gene or a part thereof and CHD-mimetope proteins 
and fragments thereof and CHD-mimetope polypeptides of the invention
are useful in immunodiagnostic testing and for raising antibodies such as 
monoclonal antibodies for such uses. Antibodies against such proteins 
and fragments and polypeptides as well as fragments of such antibodies 
(which antibody fragments include at least one antigen binding site) 
including chemically derived and recombinant fragments of such 
antibodies, and cells, such as eukaryotic cells, for instance hybridomas and 
prokaryotic recombinant cells capable of expressing and, preferably,
secreting antibodies or fragments thereof against such proteins or
fragments, also form part of the present invention.

The nucleic acids of the invention may be obtained by
conventional means such as by the recovery from organisms using PCR
technology or hybridization probes, by de novo synthesis or a combination
thereof, by cloning the CHD-nucleic acids described below or a fragment
thereof or by other techniques well known in the art of recombinant DNA
technology.

Proteins and fragments thereof and polypeptides of the
invention may be recovered from cells of organisms expressing a
CHD-gene or generated by expression of a CHD-gene or coding sequence
contained in a nucleic acid of the present invention in an appropriate
expression system and host, or obtained by de novo synthesis or a
combination thereof, by techniques well known in the art of recombinant
DNA technology. The proteins, fragments thereof and polypeptides of the
invention will contain naturally occurring L-α-amino acids and may also
contain one or more non-naturally occurring α-amino acids having the D- or
L- configuration.

Antibodies may be obtained by immunization of a suitable
host animal and recovery of the antibodies, by culture of antibody
producing cells obtained from suitably immunized host animals or by in
vitro stimulation of B-cells with a suitable CHD-protein, fragment or
polypeptide or CHD-mimetope protein, fragment or polypeptide and
culture of the cells. Such cells may be immortalized as necessary, for
instance by fusion with myeloma cells. Antibody fragments may be
obtained by well known chemical and biotechnological methods.

All these techniques are well known to practitioners of the 
arts of biotechnology. Reference may particularly be made to the well 
known text book "Molecular cloning: A laboratory manual" 2nd Edition (Eds
Sambrook, J., Fritsch, E.F. and Maniatis, T.) (Cold Spring Harbour
Laboratory, New York, 1989), hereafter referred to as "Maniatis".

The invention further provides the use of a nucleic acid,
protein, polypeptide, antibody, or antibody producing cell as hereinbefore
defined including the SNF2/SWI2, polycomb and HP1 or other chromobox
or helicase containing protein for ascertaining the sex of a cell or organism
of a bird or for isolating nucleic acids useful in ascertaining the sex of a bird
and for instituting single sex breeding programmes.

Knowledge of the chicken or great tit sex determining gene or 
genes can be used to isolate the equivalent gene or genes from other 
birds. Once isolated from a particular species, this gene or genes and its 
sequence can typically be used in two types of application: 

1. The construction of sequence based sexing tests which can
be applied to embryos, tissues and other biological materials containing 
nucleic acids. 

2. The genetic modification of the germ line of birds to create 
breeding systems that produce offspring statistically biased towards one 
sex or of one sex only (single sex breeding systems). 

A particularly preferred technique for ascertaining the sex of a 
bird in accordance with the invention involves the use of
oligonucleotides as primers in a PCR, for instance as follows:

A cell or cells or remains thereof are obtained, for instance by
surgical removal from an embryo or from the quill of a feather, and the
DNA is released by a crude lysis procedure, for instance using a detergent
or by heating. Primer oligonucleotides of the invention are used to initiate a
conventional PCR in order to amplify W chromosome linked CHD-related
DNA from the cells. The products of the PCR are analysed by agarose gel
electrophoresis and detected using labelled probes or by visual inspection.
The presence of amplified CHD-W DNA indicates the presence of a CHD-W
gene in the cells and thus, in birds, that the cell(s) were female. An
example of a similar technique has been carried out by Griffiths & Tiwari
(1995) on the Spix's Macaw (Cyanopsitta spixii). This is the world's rarest
bird (Guinness Book of Records) and DNA obtained from a moulted feather
was sufficient to allow nested PCR amplification with CHD primers to show
the bird was a male.
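The reading of the PCR result described above can be written out as a simple decision rule. The sketch below is illustrative only. The function name is hypothetical, and it assumes that a second, non-W-linked product (for instance from CHD-1A) is amplified in the same reaction and serves as a control showing that the PCR worked, so that a missing CHD-W product is not mistaken for a failed reaction.

```python
def call_sex(chd_w_product: bool, control_product: bool) -> str:
    """Interpret a CHD-based PCR sexing result.

    chd_w_product   -- True if amplified CHD-W DNA was detected on the gel.
    control_product -- True if the assumed control product (for instance
                       CHD-1A) was detected, showing the reaction worked.
    """
    if not control_product:
        # Without the control the reaction cannot be interpreted either way.
        return "no result"
    # The W chromosome is carried only by females (ZW); males are ZZ.
    return "female" if chd_w_product else "male"


if __name__ == "__main__":
    print(call_sex(chd_w_product=True, control_product=True))
    print(call_sex(chd_w_product=False, control_product=True))
    print(call_sex(chd_w_product=False, control_product=False))
```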

This technique may be applied for instance to identify the sex
of embryos or adults for subsequent breeding programs in other bird
species, or to control the sex of the progeny of breeding stock for
commercial exploitation (by selection of the breeding stock or by slaughter
or termination of animals of undesired sex).

The oligonucleotide primers for ascertaining or controlling sex 
in one species may also be used to ascertain or control sex in another 
species since hybridization of the primers to the CHD-gene of the other 
species will still serve to amplify the species-specific sequences. 

Techniques for conducting such determinations are well 
known in the art of recombinant DNA technology. 

In another aspect the present invention provides a process
for isolating a W-chromosome specific sequence associated with the CHD-W
gene of a bird which comprises probing a genomic library from a female
of the species, preferably of W chromosome sequences, for instance a
lambda phage, cosmid or YAC library, or a cDNA library constructed from a
tissue expressing the gene, with a probe comprising a nucleic acid,
fragment or oligonucleotide of the invention as hereinbefore defined and a
detectable label under high or moderate stringency.

Using the newly isolated subclone, Southern blots are
performed on male and female DNA of the species of interest at high
stringency to confirm that the correct clone has been isolated. The
CHD-gene probe should give a female specific signal (other male/female shared
bands may also be present at lesser intensities). The subclone is
sequenced using standard methods and primers suitable for PCR chosen
from the sequence so identified.

Alternatively, other approaches to cloning the sequences
related to the sex determining gene could be used such as PCR methods
using "degenerate" oligonucleotides. (For methods in PCR see, for
example, "PCR Protocols - a Guide to Methods and Applications"; edited by
M.A. Innis, D.H. Gelfand, J.J. Sninsky, T.J. White; published by Academic
Press, Inc.).

Preferably the probe is CHD-1A or CHD-W or a fragment
thereof or a nucleic acid or fragment or oligonucleotide having a sequence
exactly as set out in Fig. 5, Fig. 7 or Fig. 8 for the chicken. Techniques for
forming a genomic or cDNA library and for probing and detecting the
detectable label and isolating the nucleic acid identified by the probe are
well known in the art of biotechnology and recombinant DNA manipulation.
The process may be conducted for instance using a probe having the
chicken sequence such as the CHD-W sequence to identify and isolate the
corresponding sequence from another bird such as turkey. The
thus-identified sequence can then be used to generate primers for PCR which in
turn can be used to ascertain the sex of an individual or of cells, tissues,
embryos or ovaries of the bird. This technique has been used by obtaining
DNA from the Chicken and Hyacinth Macaw (Anodorhynchus hyacinthinus)
to design primers for the Spix's Macaw (Griffiths & Tiwari 1995). This will
permit experiments to ascertain sex to be conducted and controlled sex
breeding of the bird as described below.

In addition, the nucleotide sequences of the CHD-genes are
sufficiently conserved so that CHD primers can be designed that will allow
PCR in a range of bird species. The primers P1, P2 and P3 shown in
Figure 14 will allow CHD-W and CHD-1A amplification in a range of birds
that allows sex to be identified.
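The "degenerate" oligonucleotides mentioned earlier are one common way of writing a primer that tolerates sequence variation between species. The sketch below expands a degenerate primer, written with the standard IUPAC ambiguity codes, into the individual sequences it represents. The function names and the example primer are hypothetical and are not the primers P1, P2 or P3 of Figure 14.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list:
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    example = "ACRT"  # hypothetical primer; R stands for A or G
    print(degeneracy(example))
    print(expand(example))
```

Keeping the degeneracy low is a common design choice, since each additional ambiguous position multiplies the number of distinct oligonucleotides in the primer mix.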

The isolated nucleic acid, fragment or oligonucleotide may
thereafter be amplified, cloned or sub-cloned as necessary. The invention
further provides a process for detecting the sex of an individual bird or of
cells, tissues, embryos, foetuses or ovaries of a bird, comprising
conducting a polymerase chain reaction using DNA from the individual,
cell, tissue, embryo or ovary as template and a nucleic acid, fragment or
oligonucleotide of the invention as primer. Preferably the nucleic acid,
fragment or oligonucleotide of the invention used as primer is CHD-W or
CHD-1A or a part thereof and has a sequence corresponding exactly to the
chicken sequence in Fig. 5, Fig. 7 or Fig. 8 or a part thereof or is a nucleic
acid, fragment or oligonucleotide which is a W-chromosome specific
sequence associated with the sex determining gene or genes of a bird of
the same species as the individual cell, tissue, embryo, foetus or ovary
whose sex is to be ascertained. The W-chromosome specific sequence
associated with the sex determining gene or genes of the bird involved
may itself have been obtained by the process of isolation and amplification
or cloning described above. It can also be obtained by deduction from the
sequence in Fig. 5, Fig. 7 or Fig. 8 or a sequence from another bird or
animal.

The identification of the sex determining gene or genes 
according to the present invention raises the possibility of controlling the 
sex of progeny of commercially important animals such as chickens, 
turkeys and other avians. This will be valuable in many aspects of animal 
breeding and husbandry such as where one sex has more desirable 
characteristics, for instance only female progeny are desired for egg-laying 
breeds of chicken. The economic advantages of single sex breeding 
programmes and strategies for instituting these are described for instance
in "Exploiting New Technologies in Animal Breeding; Genetic
Developments" (Eds. Smith, C., King, J.Q.B. and McKay, J.C.) (Oxford
University Press, Oxford, 1986).

The nucleic acids making up all or part of the sex determining 
gene, from the same or different animal species, can be introduced into 
any early embryo through established transgenic technology. This latter 
includes microinjection of DNA into pronuclei or nuclei of early embryos, 
the use of retroviral vectors with either early embryos or embryonic stem 
cells, or any transformation technique, (including microinjection, 
electroporation or carrier techniques) into embryonic stem cells or other 
cells able to give rise to functional germ cells. These procedures will allow 
the derivation of individual transgenic animals (founder transgenics) or 
chimeric animals composed in part of cells carrying the introduced DNA.
Where the functional germ cells of the founder transgenic or chimeric
animal carry the introduced DNA it will be possible to obtain transmission 
of the introduced DNA to offspring and to generate lines or strains of 
animals carrying these DNA sequences. 

The nucleic acids making up part or all of the coding 
sequence of the sex determining gene, or derivatives of it, may be
introduced in combination with its own regulatory sequences
(promoter/enhancers etc.) or regulatory sequences from another gene, the
whole making the "construct", to give expression from the construct at an
appropriate developmental stage and tissue location critical to sex 
determination in the bird species under consideration. For example, in the 
chicken this would be between 6 and 7 days post lay. 

Materials and Methods 



Isolation of pGT-W, pGT1.7 and pGT8 Great Tit clones

A great tit (Parus major) library was constructed from
genomic DNA, partially restricted with MboI, and the λFixII vector
(Stratagene). The library was screened at high stringency with the 724bp
probe (GT-W) cloned from a W chromosome specific polymerase chain
reaction (PCR) product derived from the great tit (Griffiths & Tiwari 1993).
Positive plaques were subject to two rounds of purification. Clone λGT2
contained an insert of 9.6kb that hybridized strongly to the probe
sequence. The insert was subcloned as two EcoRI fragments of 1.7kb
(pGT1.7) and 8kb (pGT8) into EcoRI cut pT7/T3 (Pharmacia).

Isolation of CHD genes from the chicken

Two chicken cDNA libraries were screened. The first was a mixed sex
chick stage 10-12 cDNA library in λZapII which had been reamplified on 2
occasions. This library was provided by Dr I. J. Mason. The second library
was constructed from mixed sex, 10 day chick mRNA. Total RNA was
extracted using a guanidine thiocyanate based technique (Koopman 1993)
and mRNA isolated using a Promega PolyATtract system 1000. A λZapII
library was constructed using a Stratagene ZAP-cDNA synthesis kit.
Plaques (2x10^) from the stage 10-12 library were screened at
moderate stringency with a subcloned 433bp HindIII/SacI fragment from
pGT8 that contained the 123bp region with identity to the mouse CHD-1
gene (Delmas et al. 1993). A similar number of plaques from both libraries
were screened with bases 428-4428 of CHD-1A (see Fig. 5). The 10 day
library was also screened with bases 4059-5303 of CHD-1A (see Fig. 5).
Positive plaques were purified prior to the excision of pBluescript plasmids
and cloned inserts from λZapII using techniques recommended by
Stratagene.

Sequencing 

All sequencing was carried out using the T7 DNA 
polymerase/7-deaza-dGTP chain termination sequencing kit from USB. All 
sequencing unless otherwise specified was carried out in both directions 
either by subcloning or through exonuclease III deletion with the Promega
Erase-a-Base system.

Southern Blot Analysis and Hybridization 

Genomic DNA was extracted from blood (Griffiths & Holland
1990), digested with the appropriate restriction enzyme and Southern
blotted onto Zeta-Probe GT under neutral conditions as described by the
manufacturer (Bio-Rad). Prehybridizations and hybridizations were carried
out in 0.25M Na2HPO4/5% SDS at either 65°C (high stringency) or 62°C
(moderate stringency). Subsequent washes were carried out for a total of
1 hour in three changes of either 0.5 x SSC (75mM NaCl/7.5mM sodium
citrate (pH7.5))/0.1% SDS at 65°C (high stringency) or 1 x SSC/0.1% SDS
at 45°C (low stringency).



Sex identification with PCR on dried and limited DNA in a
Spix's Macaw

Stratagene provided a genomic Hyacinth Macaw Lambda
FixII Library (Cat. No. 946402). Plaques were screened at moderate
stringency with a 1.3kb Chicken CHD-W subclone (spans nucleotides
2670-4003 in the related Mouse CHD-1 gene (Delmas et al. 1993)). A
CHD-W genomic fragment was isolated and aligned to the chicken and
mouse homologues to allow the design and construction of 3 primers (5' to
3'):
P3 AGATATTCCGGATCTGATAGTGA,
P2 TCTGCATCGCTAAATCCTTT and
P1 ATATTCTGGATCTGATAGTGA(C/T)TC.
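
As a rough illustration of how the mixed position in P1 can be handled when checking or ordering primers, the sketch below expands each primer into its explicit sequences. It assumes that the mixed position in P1 is a C/T choice; the bracket notation parser and the function name are illustrative and not part of the described method.

```python
import itertools
import re

# Primer sequences as listed in the text; "(C/T)" is an assumed reading
# of the mixed position in P1.
PRIMERS = {
    "P1": "ATATTCTGGATCTGATAGTGA(C/T)TC",
    "P2": "TCTGCATCGCTAAATCCTTT",
    "P3": "AGATATTCCGGATCTGATAGTGA",
}

def expand_degenerate(primer):
    """Return every explicit sequence encoded by a primer written with (X/Y) choices."""
    # Split into fixed stretches and bracketed alternatives, keeping the brackets.
    parts = re.split(r"(\([ACGT/]+\))", primer)
    options = [p[1:-1].split("/") if p.startswith("(") else [p] for p in parts]
    return ["".join(combo) for combo in itertools.product(*options)]

for name, seq in PRIMERS.items():
    variants = expand_degenerate(seq)
    print(name, len(variants[0]), "nt:", ", ".join(variants))
```

Under this reading P1 is an equimolar mixture of two 24-mers, while P2 and P3 each expand to a single sequence.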

DNA from the wild Spix's Macaw was extracted (Thomas &
Paabo 1993) from 1cm portions of the tips of 3 moulted flight feathers
collected in 1994 and 1995. The negative extraction control was taken
through an identical procedure. 1.5% of these extraction products or 50ng
of genomic DNA from the reference samples were subject to semi-nested
PCR. Primary amplification consisted of 20 cycles with primers P3 and P2;
1% of the primary PCR product was subject to 30 cycles of amplification
with P2 and P1. Samples were denatured for 1.5 min at 95°C then cycled
between 57°C/30 sec, 72°C/15 sec and 94°C/30 sec with a 5 min final
extension. Products were precipitated, cut with DdeI, reprecipitated and
electrophoresed through visigel separation matrix (Stratagene). The
accuracy of the test was confirmed using DNA from Spix's and Hyacinth
Macaws of known sex (n=5, p=0.03). Uncut secondary PCR product from
the wild bird was isolated (Dretzen et al. 1981), cloned using the
Stratagene pCR-Script SK(+) kit and sequenced to confirm that the product
had originated from a Spix's Macaw.
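
The reported p=0.03 for n=5 is consistent with a simple one-sided binomial calculation. The sketch below assumes, although the text does not state it, that a test with no discriminating power would call each bird's sex correctly with probability 0.5 and that all five reference birds were called correctly. The function name is illustrative.

```python
from math import comb

def prob_at_least_k_correct(n, k=None, p_guess=0.5):
    """Probability of at least k correct sex calls out of n by guessing alone."""
    if k is None:
        k = n  # default: every bird called correctly
    return sum(
        comb(n, i) * p_guess**i * (1 - p_guess) ** (n - i)
        for i in range(k, n + 1)
    )

# Five birds of known sex, all called correctly: (1/2)^5
print(prob_at_least_k_correct(5))  # 0.03125
```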

Sex identification with PCR in a variety of birds

DNA was isolated from blood taken from Chicken (5
individuals used), Marsh Harrier (28; Circus aeruginosus) and Kestrel (18;
Falco tinnunculus), all sexed by adult plumage, Bee-eater (4; Merops
apiaster; plumage/behaviour), Boobook Owl (2; Ninox novaeseelandiae),
White-faced Owl (2; Ptilopsis leucotis), Burrowing Owl (2; Speotyto
cunicularia), Eurasian Eagle Owl (2; Bubo bubo), Long-eared Owl (2; Asio
otus), Tawny Owl (3; Strix aluco; adult size), Starling (5; Sturnus vulgaris;
beak colour) and African Marsh Warbler (5; Acrocephalus baeticatus;
reproductive behaviour). DNA from a variety of parrots sexed by
laparotomy was also used: Blue Fronted Amazon (3; Amazona a. aestiva),
Orange Winged Amazon (5; Amazona amazonica), Red Lored Amazon (3;
Amazona autumnalis), Yellow Crowned Amazon (2; Amazona o.
ochrocephala), Tucuman Amazon (2; Amazona tucumana), Blue and Gold
Macaw (6; Ara ararauna), Citron Crested Cockatoo (2; Cacatua sulphurea
citrinocristata), Lesser Patagonian (2; Cyanoliseus patagonus), Blue
Headed Pionus (1; Pionus menstruus), Plum Headed Parakeet (4;
Psittacula cyanocephala), African Grey Parrot (12; Psittacus erithacus),
Blue Throated Conure (2; Pyrrhura cruentata) and Senegal Parrot (3;
Poicephalus senegalus).

All the birds listed above were sexed from DNA using exactly
the same PCR reaction. PCR reaction volumes of 20µl were made up of
Promega Taq buffer (1x is 50mM KCl, 10mM Tris.HCl, 1.5mM MgCl2, 0.1%
Triton X-100), 200µM of each dNTP, P2 (5'-TCTGCATCGCTAAATCCTTT)
and P3 (5'-AGATATTCCGGATCTGATA) primers (approx. 1µM), 50-200ng
of genomic DNA and 0.15 units of Taq polymerase. The thermal treatment
was 94°C/1.5min followed by 30 cycles of 55 or 58°C/15sec, 72°C/15sec
and 94°C/30sec with a finish of 56°C/1min and 72°C/5min. HaeIII (5 units;
Promega) was used to cut 8µl of PCR product in 1x Promega restriction
enzyme buffer 3 and 50ng/µl bovine serum albumin (Sigma) in a total
volume of 10µl. The digests and uncut PCR product were precipitated
before being electrophoresed in a visigel (Stratagene) with ethidium
bromide (40ng/ml) at 3.5V/cm.



Results 

The plasmid pGT-W contains a 724bp insert that hybridizes 
to a 4.9kb fragment only in the female great tit. Its DNA sequence was 
determined (Fig.1) and contains a 457bp open reading frame. A search of 
the EMBL DNA and protein sequence database found no significant 
matches. The sequence does contain a simple sequence consisting of a 
22bp run of thymidines. 

The pGT-W insert was used to probe Southern blots, at low
stringency, of PvuII restricted genomic DNA of male and female great tit,
starling, jackdaw (Corvus monedula), pied wagtail (Motacilla alba) and a
species of new world flycatcher. These are species that cover the
extremes of the passeriform order according to the recent phylogeny of
Sibley et al. (1988). In all but the jackdaw convincing hybridization to a
single female specific fragment could be observed. In all species,
hybridization to one or more non-sex specific fragments was also shown.
A similar experiment was carried out with a non-passerine, the bee-eater
(Merops apiaster), and this too resulted in faint hybridization to a female
specific fragment and two somewhat stronger bands in both sexes.

In order to further investigate the nature of the pGT-W insert 
we attempted to clone a larger fragment of genomic DNA which 
incorporated this motif. From around 1 .5 x1 0= plaques from a great tit 
genomic library, two positives were obtained. After purification one of 
these gave superior hybridization and was investigated further. The 9.7kb 
insert was subcloned as pGT1.7 and pGT8 containing 1.7kb and 8 kb 
respectively. The pGT1.7 was sequenced in its entirety and approximately 
2.8kb of the sequence of pGT8 was determined. Both were sequenced in 
a single direction. A 723bp region, starting 133bp from the 5' end of pGT8,
had a sequence that corresponded exactly to the pGT-W insert (Fig. 2).

The sequences derived from these subclones were used to
search the EMBL database using the FASTA algorithms (GCG, Wisconsin
package vers 7.3). A region of 123bp, starting 994bp from the 5' end of
pGT8, showed a 79% nucleotide sequence identity to bases 3855-3977 of
the mouse CHD-1 gene (Fig. 3; Delmas et al. 1993). This corresponds to
an 88% identity at the amino acid level.
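
Identity figures such as the 79% quoted here are the number of matching positions divided by the aligned length, so 79% over 123bp corresponds to roughly 97 matching bases. A minimal sketch of that calculation follows. It uses short made-up sequences rather than the actual pGT8 and CHD-1 data, and skips gapped columns, which is one common convention and not necessarily the one FASTA reports.

```python
def percent_identity(a, b):
    """Percentage of identical positions between two pre-aligned sequences.

    Both sequences must have the same aligned length; columns containing
    a gap character '-' in either sequence are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)

# Toy example: 8 of 10 positions agree.
print(percent_identity("ATGGCCAAGT", "ATGACCAAGA"))  # 80.0
```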

Southern blots of PvuII digests of genomic DNA from male
and female chicken and lesser black-backed gull (Larus fuscus) were
probed at low stringency with a 433bp SacI/HindIII fragment of pGT8 that
included the 123bp region with CHD-1 identity (Fig. 4). Figure 12 shows
that in the chicken hybridization was with a fragment of 3.1kb in the female
only and with fragments of 1.5 and 6.0kb in both sexes. In the gull
hybridization is similarly with a female specific fragment of 4.0kb and a
fragment of 3.0kb in males and females.

Delmas et al. (1993) have already demonstrated the
universal occurrence of CHD-1 in the mammals. The evidence this blot
provides, which features species representing both the major divisions of
the birds, suggests that a minimum of two types of CHD gene exist in this
Class. The first we termed CHD-W to denote its W linkage. The 123bp
region from the great tit would appear to be a short exon from this gene.
The second hypothetical gene is closely related to CHD-W and we have
termed it CHD-1A, where the A denotes its avian nature. This gene is either
Z or autosomally linked as it occurs in both sexes.

Isolation of CHD-1A

The SacI/HindIII great tit probe was used at low stringency to
screen a λZap II cDNA library from stage 10-12 (33-49hrs after the
appearance of the primitive streak) chicken embryos. A plating of 2x10*
plaques yielded a panel of 25 positive clones. 19 of these continued to
hybridize intensely after purification. From three clones Z4, Z6 and Z11 a
composite 6608 nucleotide sequence (Fig. 5) was determined using the
strategy illustrated in Fig. 6.

The insert from the Z6 clone (bases 418-4426; Fig. 5) and a
BglII (AGATCT) fragment of the Z4 clone (bases 4059-5303; Fig. 5) were
used separately to screen a similar number of plaques from a second
cDNA library constructed from 10 day old chicken embryos. This
screening identified a total of 45 positives of which 16 were found to have
sequence identity with the composite sequence derived from the first
library. Two additional clones contained a closely related sequence that is
dealt with below.

A proportion of the clones from both libraries show variation
from the sequence given in Fig. 5 in one respect. Clones Z1, Z13, Z17,
Z20 and Z23 are identical to the composite sequence 5' to base 4327; from
there they terminate in an additional 37 to 163 bases of a new sequence
that is identical in all five. Two clones from the second library, CC43 and
CC56, have 22 or 254bp of the same sequence at their 5' ends.
Downstream of this motif both clones regain homology with the
composite sequence at base 4328 and show no further deviation from the
original sequence. From these seven clones a composite 264bp sequence
can be derived and this is illustrated in Fig. 7. None of the seven clones
contains the whole of this sequence. Moreover, none of the ten clones that
span the 4327/4328 insertion point contain any of this additional region. If
inserted at this position, the motif has an in-frame open reading frame
spanning its entire length. The motif is extremely adenosine rich and this
makes the amino acid lysine extremely common in the putative translation
(see Fig. 7). There are no splice donor or acceptor sites within the motif,
suggesting it is a final rather than an intermediary product of splicing.

Hybridization of a probe running from 2534 to 4428bp of the
chicken sequence to a blot of PvuII cut, male and female
chicken genomic DNA shows that hybridization occurs to fragments that
are both W and autosomally or Z chromosomally located. The level of
hybridization is significantly stronger to the fragments common to both
sexes, suggesting that the probe represents the CHD-1A gene.

CHD-1A is very closely related to the mouse CHD-1 gene 
being 79.8% identical in a 5152nt overlap. At the amino acid level the 
identity is raised to 90% over 1750 residues. We do have an additional 
1202bp of the 3' untranslated region but have not encountered a clone with 
an AATAAA termination signal or a 3' homopolymeric T tail. Both mouse 
and chicken sequences contain a stop codon in the same relative positions 
and sequence similarity is insignificant after this point. The published 
mouse sequence does not contain the additional 264bp motif described 
above. 

The database search also identified an unpublished chicken
derived sequence tagged as a delta crystallin binding protein (DCBP), with
even greater identity than the mouse CHD-1 gene: 99% over 2293 bp and
94% over 571 amino acid residues. The DCBP sequence is of 2292bp
which extends over nucleotides 1922 to 4214 of CHD-1A (Fig. 5). Despite
the high nucleotide sequence identity the region of amino acid similarity
does not extend the full length of the DCBP. This is due to apparent
deletions in the DCBP clone that provide an initiation methionine codon
(257nt DCBP) and a stop codon (1939nt DCBP). The extremely high
sequence identity, the fact that identity is maintained after the apparent
stop in the DCBP sequence, that none of the 41 CHD-related clones we
found have exact sequence identity and that only small sequencing
mistakes would be required to introduce false stop and start codons
suggest that the DCBP sequence is CHD-1A but has been sequenced
slightly inaccurately. Further evidence is required to confirm this.

The database search with the whole CHD-1A gene also
revealed significant identity to a previously unidentified portion of a 15 kb
region of S. cerevisiae chromosome V. This region comprises an open
reading frame of 4.4kb which lies between the RAD4 (Gietz & Prakash
1988) and the poly-A binding protein (Sachs et al. 1986) gene coding
regions. In an overlap of 1538 amino acids, the whole of the yeast open
reading frame, there is an identity of 37.7% and a similarity of 59% (Fig.
10). The degree of conservation this similarity implies suggests the yeast
sequence encodes a homologue of CHD-1A that we shall term CHD-1Y for
the sake of discussion.

Delmas et al. (1993) identified four motifs in CHD-1 with
possible functional significance. CHD-1A retains such close homology to
CHD-1 that these regions are virtually unchanged and are likely to perform
similar functions as they do in the mouse.

The first motif is a chromodomain (Paro & Hogness 1991)
which falls between residues 274 and 311 (Fig. 9). Figure 11 compares
the amino acid sequence of this region to that of eight others identified
through a search of the EMBL database. The sequences fall into three
categories. The first comprises the domain from CHD-1, CHD-1A and
CHD-1Y. The second and third chromobox groups have been previously
identified by Pearce et al. (1992). The HP1 class comprises the Drosophila
(James & Elgin 1986) and human (Saunders et al. 1993) HP1 genes and
two murine modifier (Mod) genes (Singh et al. 1991). The HP1 class is
characterized mainly by a glutamic acid rich block of six residues upstream
of the chromobox. The third group, the Pc class, comprises the Drosophila
Pc gene (Paro & Hogness 1991) itself and its putative murine homologue
the Mod3 gene (Pearce et al. 1992).

A search of the EMBL database with the CHD-1A putative
helicase domain (residues 451-911, Fig. 9) raises the identity between this
and CHD-1Y to 55% in an overlap of 471 amino acids. There is also
significant, but lesser, identity to the putative helicase motifs in the human
(Okabe et al. 1992) and S. cerevisiae (Laurent et al. 1992) SNF2 gene,
human (Muchardt & Yaniv 1993) and Drosophila Brahma (Tamkun et al.
1992), S. cerevisiae NPS1/STH1 (Laurent et al. 1992, Tsuchiya et al.
1992), human excision repair protein ERCC6 (Troelstra et al. 1992) and
the RAD54 (Emery et al. 1991) and MOT1 (Davis et al. 1992) genes of S.
cerevisiae. It should be noted that none of these latter genes contain a
chromobox.

Only the four CHD genes show significant homology to the
third motif, a DNA binding region identified by Delmas et al. (1993), whilst
only CHD-1A and CHD-1 have the three short basic HSDHR motifs near the
carboxy terminus, although this region is yet to be sequenced in CHD-W.
The CHD-1Y gene apparently terminates before this point so does not
share this motif. An extended discussion of the homology of the mouse
CHD-1 gene can be found in Stokes & Perry (1995).

Isolation of CHD-W

Two, Cei4 and CC4, of eight CHD-1 related clones isolated
from the 10 day chick embryo library using 349-4359nt of CHD-1A as a
probe, overlap (Fig. 5) to provide the 1316bp of sequence given in Fig. 8.
This is a sequence closely related to, but distinct from, CHD-1A. Identity
over the 1316bp overlap is 90.5% and 90.1% at the nucleotide and amino
acid level respectively. An alignment of the putative translations of CHD-1,
CHD-1A and CHD-W is given in Fig. 9. The amino acid identity between
CHD-1 and CHD-1A, at 93.4%, is marginally lower than that between
CHD-1 and CHD-W, 94.2%, over the same region.

The 1335bp insert of CC4 was used at moderate stringency
to probe a male/female, PvuII cut genomic blot featuring mouse, ostrich
(Struthio camelus), chicken, bee-eater and hyacinth macaw (Fig. 13).
Hybridization with the mouse and ostrich shows no evidence of any sex
linkage, bands of the same size and equal intensity appearing in both
sexes. Hybridization with the ostrich is particularly strong, greater even
than with the cognate sequence in the chicken. This suggests that the
genome size of the ostrich is considerably smaller than that of the chicken.
It also demonstrates that the CHD genes cannot be used to sex the ostrich
and, it is suggested, the other members of the ratites. There is no
evidence from further work (reported later) that this effect should occur in
other Parvclasses of the birds (Sibley et al. 1988).

In all the bird species apart from the ostrich, hybridization
occurs with two types of fragment: some that are female unique and others
that are shared between the sexes. In the chicken some of the latter are of
the same size as those hybridizing with the CHD-1A probe and result from
cross hybridization under the conditions of low stringency that we
employed. When probed with the CC4 sequence it is clear that
hybridization with the female linked fragments is far stronger, at least in the
chicken, than with the shared fragments (bear in mind, also, that the female
chicken only has a single dosage of the W linked gene). This indicates
that CC4 is W linked and represents part of CHD-W.

The lanes of the Southern blot of the male and
female chickens probed in Fig. 13 contained identical amounts of DNA.
However, examination shows that the shared bands are twice as strong in
males (ZZ) as they are in females (WZ). The only way this could have
happened is if the CHD-1A gene is Z linked. It is suggested this is the
case in all birds.
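
The dosage argument can be made explicit. With equal DNA loading, the expected male:female band intensity ratio is 1:1 for an autosomal gene, 2:1 for a Z linked gene and 0:1 for a W linked gene. The sketch below picks the linkage whose expected ratio lies closest to an observed ratio. The intensity values and the nearest-ratio rule are illustrative assumptions, not measurements or procedures from the text.

```python
# Gene copies per diploid genome in birds (male ZZ, female ZW).
COPIES = {
    "autosomal": {"male": 2, "female": 2},
    "Z-linked": {"male": 2, "female": 1},
    "W-linked": {"male": 0, "female": 1},
}

def infer_linkage(male_signal, female_signal):
    """Return the linkage whose expected male:female dosage best fits the band intensities.

    Assumes equal amounts of DNA were loaded for both sexes and that the
    band is present in the female (female_signal > 0).
    """
    if female_signal <= 0:
        raise ValueError("female signal must be positive")
    observed = male_signal / female_signal
    expected = {name: c["male"] / c["female"] for name, c in COPIES.items()}
    return min(expected, key=lambda name: abs(expected[name] - observed))

# Hypothetical densitometry readings in arbitrary units.
print(infer_linkage(2.0, 1.0))  # Z-linked
print(infer_linkage(0.0, 1.0))  # W-linked
```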

Sex identification with PCR on dried and limited DNA in a Spix's
Macaw

The first test was devised to sex DNA extracted from the 
feathers of the last wild Spix's Macaw. This was the rarest bird on the 
planet and needed to be sexed so a mate could be selected from the 31 
captive birds that remained. The test presented two problems. The first 
was extracting DNA from feathers, the second providing a test that would
work.

The procedure was published in Griffiths & Tiwari (1995),
which covers the extraction of the DNA. The second step was to obtain
DNA from a Hyacinth Macaw which would yield data to allow construction
of primers. A FIX II library was provided by Stratagene and this was
probed with the insert of the CHD-1A clone Z6 (-227-5302 Fig. 6) at
moderate stringency. This provided 7 positive clones (A1, A2, A7, A8,
A13, 1.2 and 5C). The inserts were extracted, cut with MboI and subcloned
into BamHI cut pUC18. This sublibrary was probed again with the Z6
insert but this time at high stringency. The A12.3 subclone hybridized.
This was sequenced and contained 111bp which is aligned to the chicken
and mouse CHD genes in Fig. 14. The similarity of this fragment to the
chicken CHD-W suggested this was the Hyacinth Macaw homologue of the
W chromosome located gene.

The data from A12.3 supplied information for the design of
the primers required. It also provided evidence that the CHD sequences
were sufficiently conserved in this region that a single set of primers could
be designed to amplify both genes. Three primers, P1, P2 and P3, were
designed to allow seminested PCR (Fig. 14). This technique allowed
amplification of a 104bp region of both CHD-W and CHD-1A from DNA that
was available from two captive Spix's Macaws of known sex. In each sex
the PCR products were of the same size but sequence determination
revealed that the CHD-W derived PCR product possessed a DdeI
restriction enzyme site which was lacking in the CHD-1A product. Thus
PCR amplification and DdeI cleavage of male Spix's Macaw DNA yields
only a single product of 104 base pairs (bp), whilst from female DNA two
products are apparent, one of 104bp and one of 73bp. The presence of
the CHD-1A product in both sexes acts as a control to ensure the PCR
amplification has been successful (Fig. 15 & 16).
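
The band pattern described above amounts to a simple decision rule, sketched here. The function name, the size tolerance and the "failed" outcome are illustrative assumptions; only the 104bp and 73bp band sizes come from the text. The remaining fragment of the cut CHD-W product (104 - 73 = 31bp by arithmetic) is not used by the rule.

```python
def call_sex_from_ddei_bands(band_sizes_bp, tolerance_bp=3):
    """Interpret DdeI-digested products of the seminested PCR on Spix's Macaw DNA.

    band_sizes_bp: estimated sizes of the bands seen on the gel.
    Returns "male", "female", or "failed" when the 104bp CHD-1A control
    band is missing and the result is therefore uninformative.
    """
    def has_band(size):
        return any(abs(b - size) <= tolerance_bp for b in band_sizes_bp)

    if not has_band(104):
        return "failed"  # no CHD-1A control product: amplification did not work
    # A 73bp band can only arise from DdeI cleavage of the W-linked product.
    return "female" if has_band(73) else "male"

print(call_sex_from_ddei_bands([104]))      # male
print(call_sex_from_ddei_bands([104, 73]))  # female
```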

DNA was extracted from feathers moulted by the wild Spix's
Macaw using a technique devised for the purification of ancient DNA
(Thomas & Paabo 1993). The PCR-based test described above was used
to demonstrate that CHD-W was not present in the sample (see Fig. 16).
This confirmed that the wild bird is male. A female Spix's Macaw was
released in March 1995 as a prospective mate.

Sex identification with PCR on a variety of birds

Birds can be sexed from DNA by showing the presence 
(female: ZW) or absence (male: ZZ) of the female specific W chromosome. 
At the molecular level this is carried out by the recognition of a W-linked 
marker. This can only be done after a W chromosome DNA marker is 
identified in the avian species. The test developed for the Spix's Macaw 
used CHD-W as a W linked marker. The data collected in designing this 
test suggested that this method may work to sex a variety of birds. 

If the same test is to work on other bird species then two
criteria must be met. The first is that the PCR primers amplify both
CHD genes in other bird species. The Spix's Macaw test used the tiny
amounts of DNA extracted from feathers so a seminested PCR was
required. This used 3 primers which are aligned to the Mouse and Chicken
CHD nucleotide sequences in Figure 14. The primer sites are highly
conserved: there is no difference between the chicken genes and a solitary
difference between the Mouse and Chicken in the 5' region of the P2 site.
Theoretically, the primers should anneal to other bird species and, if a
reasonable amount of DNA is available (>50ng), a single pair of primers
should provide sufficient amplification.

A second requirement for the test is that the PCR products
can be distinguished using a restriction endonuclease. In the Spix's Macaw
the DdeI enzyme cuts CHD-W but not CHD-1A. Figure 14 shows that this
discrimination would also occur in the Chicken. However, the DdeI cutting
site CTNAG is absent from the CHD-1A of Spix's Macaw (CTNGG) and
the Chicken (CANAG) for different reasons. This suggests that the DdeI site
is open to mutation so this form of discrimination is unlikely to be
conserved. Other discriminatory sites are available: DdeI and MaeII sites
are unique to CHD-W and the HaeIII, MboII and XhoI sites to CHD-1A and
can be considered the first option. If these fail, the CHD-W and CHD-1A
PCR fragments can be cloned and sequenced so discriminatory sites can
be discovered.
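
Finding discriminatory sites once both fragments have been sequenced is a simple pattern search. The sketch below scans two sequences for the standard DdeI (CTNAG) and HaeIII (GGCC) recognition sequences and reports enzymes that cut only one of them. The two short sequences are invented for illustration and are not the Fig. 14 sequences; the function names are likewise hypothetical.

```python
import re

# Recognition sequences; N stands for any base. Both sites are palindromic,
# so searching one strand is sufficient.
SITES = {"DdeI": "CTNAG", "HaeIII": "GGCC"}

def find_sites(seq, site):
    """Return 0-based start positions of a recognition sequence in seq."""
    pattern = site.replace("N", "[ACGT]")
    # Lookahead so that overlapping occurrences are all reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq.upper())]

def discriminating_enzymes(seq_w, seq_z):
    """Map each enzyme that cuts exactly one fragment to the fragment it cuts."""
    result = {}
    for name, site in SITES.items():
        cuts_w = bool(find_sites(seq_w, site))
        cuts_z = bool(find_sites(seq_z, site))
        if cuts_w != cuts_z:
            result[name] = "W" if cuts_w else "Z"
    return result

# Invented toy fragments standing in for CHD-W and CHD-1A amplicons.
toy_w = "AATTCTGAGTTAACGTA"  # contains CTGAG (DdeI)
toy_z = "AATTCTGGGTTGGCCTA"  # contains GGCC (HaeIII), DdeI site lost
print(discriminating_enzymes(toy_w, toy_z))
```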

The theory we have presented suggests that a sexing test
based on both avian CHD genes should work on many other bird species.
Does this work in practice? The birds selected for trial are from across the
avian class: Chicken (5 individuals), Marbled Murrelet (18), Kestrel (8),
Marsh Harrier (28), Bee-eater (4), 1 pair of six species of Strigidae Owls
from different genera (see Methods), Starling (5) and African Marsh
Warbler (5).

The primers amplify a PCR product of the predicted size in all
of the birds using primers P2 and P3 on 50-100ng of genomic DNA
extracted from blood. Figure 17 illustrates this for 3 bird species but also
includes amplification from human DNA. This shows that tests using P2
and P3 are open to human DNA contamination so appropriate precautions
must be taken.

The HaeIII restriction enzyme cut the CHD-1A fragment alone
in all 13 species (Fig. 17) and, from the sequence data, would also have
worked on the Spix's Macaw (Fig. 16). Figure 17 shows that the CHD-1A in
males is cut into two fragments (45bp, 59bp) which are not easily visible on
the gel. In females CHD-W is uncut by HaeIII so remains at 104bp. The
discrimination using HaeIII provided correct sex identification in all
individuals.

Discussion

The CHD genes 

The female specific great tit probe GT-W was described by
Griffiths and Tiwari (1993) as a means of identifying sex in this species.
The results presented here suggest this sequence represents part of an
intron in a W linked gene. By moving downstream from this sequence it
has been possible to isolate a putative exon from a gene that we have
named CHD-W due to its close sequence identity to the mouse CHD-1
gene (Delmas et al. 1993) and its W location.

Using the CHD-W fragment we attempted to isolate a similar,
W linked sequence that Southern blot analysis had shown was present in
the chicken. From several clones a 6606bp cDNA sequence was
assembled but, although it has close sequence identity to the great tit
CHD-W fragment, Southern blot analysis shows it is not located on the W
chromosome. This second gene was termed CHD-1A (A = avian). This
blot shows a second gene closely related to CHD-1A is W located. This
sequence could not be cloned from a stage 10-12 chick cDNA library
although 19 CHD-1A clones were isolated. However, two clones yielding
1347bp of a second CHD gene were isolated along with a further 14
CHD-1A clones from a day 10 chick cDNA library. Southern blot analysis
showed that this second clone was W chromosome derived and so
represents CHD-W. Attempts are underway to isolate the remainder of
CHD-W.

Southern blots of a variety of bird species showed that
CHD-W is W chromosome linked in all birds except the ostrich. This
suggests that the gene is sex linked throughout the class with the
exception of the primitive ratites, which the ostrich represents, where it
appears to be autosomally located.

An alternative explanation is that the CHD-W is in fact W
linked in ratites but occurs in a region of the W chromosome which still
recombines with the Z chromosome. If CHD-1A were Z linked, then
recombination between Z and W linked copies of CHD would maintain their
sequence identity resulting in the apparently autosomal location indicated
by the Southern blot. A mammalian example would be the MIC2 and STS
genes that are located in the pseudoautosomal region of the Y
chromosome (Ellis & Goodfellow 1989) and would give analogous results
to those observed here.

Two lines of evidence support this alternative hypothesis.
The first is that the Southern blot analysis suggests that CHD-1A is Z
linked in non-ratites which would make the chromosomal location of the
CHD genes consistent throughout the class. Hybridization of CHD-1A to
genomic blots is apparently stronger to fragments from male birds which
would result from this sex having two copies of any Z linked gene in
comparison to a single copy in the female (this result is not clear cut and
requires confirmation by chromosomal in situ). The second line of
evidence is that the sex chromosomes of the ratites are not
morphologically differentiated as is the case with other birds (Christidis
1990). Morphological similarity suggests recombination still occurs
between extensive regions of the ratite Z and W which may include the
CHD genes and so produce the pattern of hybridization observed.

Although we have yet to clone the whole of CHD-1A, the
6606bp sequenced so far shows a close identity to the mouse CHD-1 gene
over the putative coding region. It also includes all four features identified
by Delmas et al. (1993) as having possible functional significance. These
are a chromodomain, a helicase, a DNA binding motif and a basic,
five amino acid motif that is repeated three times (Fig. 9). The similarity of
the sequence derived thus far from CHD-W to that of CHD-1 and CHD-1A
suggests it will be of similar length and possess these same motifs. We
have also identified an alternatively spliced form of CHD-1A and CHD-W
which has a similar adenine rich motif inserted at an identical point
(4327/4328nt CHD-1A and 1316nt CHD-W). The exact form of these
alternative mRNAs has yet to be elucidated. It is interesting to note that we
obtained no clones that spanned these breakpoints and contained this
additional motif; the sequence was built up from partial sequences derived
from either the 5' or 3' termini of different clones. Delmas et al. (1993)
produced an mRNA Northern blot probed with fragments of CHD-1 occurring
5' to this breakpoint and discovered an mRNA species of about 4kb. This
would correspond to a species cleaved near this insertion point. What
purpose this would serve is unknown. Moreover, the putative yeast
homologue of CHD, CHD-1Y, which was identified from amino acid identity
to CHD-1A from the genomic sequence on the EMBL database, does not
apparently have a similar motif. This is suggested because the CHD-1Y
sequence was derived from a genomic clone which would allow the
identification of any such sequence were it to be spliced in the normal
manner.

The significance of the four functional domains found in the
CHD genes will be discussed in turn. The first, the carboxy-terminal trimer
repeat of five basic amino acid residues, has no known function and is not
shared by any other sequences from the EMBL database. Furthermore,
the CHD-1Y gene, which is truncated by a little over 200 amino acid
residues in comparison to CHD-1 and CHD-1A, does not contain this motif.

The second functional domain was identified by Delmas et al.
(1993) as having sequence selective DNA binding capacity. Whether this
is highly specific or just to A+T rich regions was not established. They
also noted that this domain contains Lys-Arg-Pro-Lys-Lys and Arg-Gly-Arg-
Pro-Arg motifs which enable the products of genes like HMG-1, D1 and
Engrailed to bind in the minor groove of A+T rich DNA.

A third functional motif is located towards the N-terminus of
the CHD-protein and is termed the chromodomain (Chromatin Organization
Modifier; Paro 1990). This is a highly conserved domain of between
37 and 50 amino acids that has been shown to be represented in the genomes
of plants, nematodes, insects and vertebrates (Singh et al. 1991). Several
chromobox genes have been isolated from human, mouse and Drosophila
and have been divided into the polycomb (Pc) class and the
heterochromatin protein-1 (HP1) class on the basis of related structure
(Pearce et al. 1992). The CHD-genes have a distinct form of the
chromobox characterized by close homology between yeast and vertebrate
forms in the 5' half of the box itself but extending a further 17 residues
downstream. These differences indicate that this form of the chromobox
defines a third subgroup, the CHD class.

The Pc gene forms one of an eponymously named group (Pc-g)
of about 12 genes defined through homeotic mutants in Drosophila that
prevent fixation and maintenance of a determined state. They act as
transcriptional repressors of homeotic genes, notably of the antennapedia
complex (ANT-C; Paro 1990). Members of the ANT-C and the other major
group of Drosophila homeotic genes, the bithorax complex (BX-C), are
responsible for defining segmental identity during development (Kaufman
et al. 1980, Lewis 1978). Initially, their expression patterns are designated
by early acting maternal and segmentation genes (see Kennison 1993).
However, these maternal genes are only transiently expressed. During the
later stages of development their role as transcriptional activators is
adopted by an assemblage of genes including the trithorax group (Trx-g),
whilst many of their repressive effects are assumed by the Pc-g (Kennison
1993).

The polycomb (Pc) gene itself is perhaps the best studied
member of the Pc-g. Zink and Paro (1989) used Pc-β-galactosidase fusion
proteins to show that it binds to around 100 different sites on the polytene
chromosomes, including loci where other members of the Pc-g are located.
Any disruption of the chromodomain abolishes the specificity of this
reaction (Messmer et al. 1992). However, the Pc protein appears to lack
any type of endogenous DNA binding capacity so it is thought that it acts
as part of a protein complex with other components that are responsible for
the site specific DNA binding (Paro 1990).

The repressive effects of the Pc-g are thought to be the result
of chromatin compaction. In other words, the DNA is packaged into
heterochromatin to prevent or reduce the expression of functional genes
(Paro 1990). This is a mechanism related to position effect variegation
(PEV; Henikoff 1990), to dosage compensation in mammals, which sees
the complete heterochromatization of one of the female's X chromosomes,
and possibly to gene imprinting, whereby the expression of maternally and
paternally inherited alleles differs (Peterson & Sapienza 1993). The links
with PEV have recently been substantiated in that HP1, a recognized
modifier of PEV, and Pc both contain chromodomains (Paro & Hogness
1991). Like the Pc protein, HP1 appears to form part of a structural
complex that transforms euchromatin to heterochromatin. Furthermore,
both PEV and the repressive effects of Pc are passed in a clonal manner
to daughter cells (Henikoff 1990, Struhl 1981), a characteristic also of
gene imprinting.

With the CHD-type genes containing both a DNA binding motif
and a chromobox it may appear reasonable to suggest that they encode
repressors with an endogenous, site selective DNA binding system.
However, CHD genes contain a further functional motif that is structurally
related to the Helicases. The sequence identity is closest to the yeast
SNF2/SWI2 (Abrams et al. 1986) and Drosophila Brahma genes (Tamkun
et al. 1992), both of which are transcriptional activators. Indeed, Brahma is
part of the Trx-g, which are considered direct antagonists to the Pc-g.
Other genes which contain more distantly related Helicase domains are
involved in DNA repair and chromatid separation during mitosis (Laurent et
al. 1993, Sung et al. 1993).

The SWI2 gene product has been shown to enhance the
transcription of other genes, probably as part of a complex that includes
SWI1, SWI3, SNF5 and SNF6 and in conjunction with gene specific DNA
binding proteins (Laurent et al. 1991, Peterson & Herskowitz 1992). This is
a mode of action strikingly similar to that of Pc.

Although it remains to be formally demonstrated that SWI2 is
a helicase, it does have close structural similarities with proven Helicase
genes and also possesses the required DNA stimulated ATPase activity
(Laurent et al. 1993). Laurent et al. go on to postulate that the SWI2
containing complex may act by two mechanisms, acting either separately or
in conjunction. In the first they envisage helicase mediated DNA melting to
allow the egress of RNA polymerase II. Alternatively, SWI2 could allow
chromatin remodelling, in effect overcoming any inhibitory packaging of the
DNA and so enhancing transcription.

The juxtaposition of a Helicase and a chromodomain within 
the same gene presents a paradox that may challenge the perceived roles 
of the two motifs. A simple explanation is that alternative splicing could 
remove one or other of these domains prior to translation. However, there 
is little support for this idea from the work of ourselves or Delmas et al.
(1993).

An alternative explanation could lie in our lack of real
knowledge about the function of the chromobox. Whilst it is well
established that Helicases do disassociate DNA and so facilitate
transcription (Matson & Kaiser-Rogers 1990), the role of the
chromodomain in repression is based on more circumstantial evidence.
Pc, as we have seen, does not bind DNA itself although mutations in the
chromobox prevent the formation of site specific complexes. It is possible
that the chromodomain is involved more in maintaining the structural
integrity of the repressive complex than in the repressive mechanism itself.
Based on this supposition, the CHD-protein may form a different type of
complex able to bind at a site dictated or influenced by its own binding
domain and activate these loci via helicase activity.

While this scenario is speculative, it is probable that
CHD-type genes are active during development and are able to bring
about heritable changes in transcription. The presence of an endogenous
DNA binding domain suggests that a CHD-protein has fewer targets than
Pc, for example, which could form part of several different active
complexes. With CHD-W being confined to the W chromosome, it is likely
to have a role in some aspect of female development and we suggest this
may be critical to the determination of gender. In support of this
hypothesis we were unable to find any CHD-W clones in a library
constructed prior to sex determination, which occurs at day 7
(Lutz-Ostertag 1954), but were able to isolate two clones
from a smaller pool of candidates at day 10. This suggests that the
expression of CHD-W may occur at a time consistent with its having a sex
determining role.

If CHD-W alone or in conjunction with CHD-1A causes sex
determination in birds then several potential mechanisms are plausible.

(1) In the simplest scenario CHD-W may act as a simple trigger
like SRY (Koopman 1993) to cause either expression or repression of
downstream genes in order to initiate testis development.

(2) CHD-W may interact with other autosomal or Z linked genes
whereby the dosage of CHD-W in comparison to these other factors
initiates development down the male or female pathways.

A more complicated scenario is one in which CHD-W acts together
with CHD-1A to cause sexual differentiation. Different mechanisms could
operate depending on whether CHD-1A turns out to be Z linked, as we
suspect, or autosomal.

(3) If CHD-1A is Z linked, then male birds get two doses of the
CHD-1A expression product to one in female birds. Perhaps the 1:1 ratio
of functionally distinct CHD-1A and CHD-W products is what initiates
female development whilst a double dosage of CHD-1A results in males.

(4) Alternatively, just the single dosage of Z linked CHD-1A
product could result in female development and expression of CHD-W only
occurs after sexual differentiation to equalize dosages of functionally
similar proteins.

(5) If CHD-1A is autosomal, however, it could be envisaged that
CHD-1A and CHD-W are functional homologues and the three doses in
females (AAW) are required to promote female development, whilst the
double dosage in males (AA) causes the differentiation of the testis and the
development of the male phenotype.
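
The gene doses implied by mechanisms (3) to (5) can be tabulated for the chromosome complements of Table 1. The following is an illustrative sketch only and is not part of the experimental work. It assumes that each letter of a complement (A, Z or W) contributes exactly one copy of any gene located on that chromosome or haploid autosome set; the function name chd_doses is hypothetical. It reports doses and makes no prediction of phenotype.

```python
from collections import Counter

# Chromosome complements and phenotypes as listed in Table 1.
TABLE_1 = {
    "AAZZ": "Male",
    "AAZW": "Female",
    "AAZZW": "Male?",
    "AAZZZ": "Male",
    "AAAZZZ": "Male",
    "AAAZZW": "Intersex/male",
}


def chd_doses(complement, chd1a_location):
    """Return (CHD-1A copies, CHD-W copies) for a complement such as 'AAZW'.

    Assumption: every 'A', 'Z' or 'W' letter contributes one copy of a gene
    located on that chromosome (or haploid autosome set).
    """
    if chd1a_location not in ("Z", "A"):
        raise ValueError("chd1a_location must be 'Z' or 'A'")
    counts = Counter(complement)
    return counts[chd1a_location], counts["W"]


if __name__ == "__main__":
    for complement, phenotype in TABLE_1.items():
        z_linked = chd_doses(complement, "Z")   # mechanisms (3) and (4)
        autosomal = chd_doses(complement, "A")  # mechanism (5)
        print(f"{complement:7s} {phenotype:14s} "
              f"Z-linked CHD-1A:CHD-W = {z_linked[0]}:{z_linked[1]}  "
              f"autosomal CHD-1A:CHD-W = {autosomal[0]}:{autosomal[1]}")
```

Under the Z linked assumption a normal female (AAZW) carries CHD-1A and CHD-W in a 1:1 ratio and a normal male (AAZZ) carries two doses of CHD-1A alone, as described in mechanism (3).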

The evidence from aneuploid chickens discussed in the
introduction does suggest that the mechanism that operates involves
some degree of dosage dependence, which tends to exclude mechanism
(1). However, the similarity of CHD-W to HP1, the Pc protein and other
transcriptional modifiers that act through chromatin remodelling shows that
the expression of genes of this type is crucially dependent on dosage
(Locke et al. 1988). With the different dosages of gene product and/or
potential target sites that aneuploids possess it may be that analysis of
these types of mutants has, thus far, served to confuse the issue.

Sex Identification 

The first W-chromosome linked DNA was isolated by Tone et
al. (1982) from the Chicken. Since then, a number of other W-linked avian
sequences have been discovered (e.g. Griffiths, 1990; Rabenold, 1991;
Griffiths, 1993). In all but one case, described later, these DNA fragments
appear to be non-functional repeats. For instance, the related XhoI and
EcoRI fragments in Chicken may comprise 70-90% of the W chromosome
(Saitoh et al. 1991). This repeat and others in the Lesser Black-backed
Gull (Larus fuscus) can be used to sex birds by the rapid dot blotting
technique (Griffiths & Holland 1990). Other less repetitive W chromosome
markers can be used to sex birds either by probing Southern blots
(Rabenold et al. 1991) or through the use of PCR (Griffiths & Tiwari 1993).

The major problem with all non-functional W-linked DNA is
the speed with which it evolves. The chicken XhoI repeat is fairly typical.
Through low stringency hybridization to a Southern blot it can be used to
sex the Turkey (Meleagris gallopavo) and the Pheasant (Phasianus
versicolor; Saitoh et al. 1991). These bird species are closely related to
the Chicken, being members of the family Phasianidae. By contrast, the
functional CHD-W region described here is 96% (3/67; Fig. 3) identical
between Chicken and Spix's Macaw and this only drops to 86% between
the Chicken CHD-W and the Mouse CHD1 (15/110; Fig. 3). This level of
conservation means that the chicken CHD-W probe can be used on
Southern blots to sex birds from all over the class Aves.
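
The identity figures quoted above can be checked with a short calculation. This is a minimal sketch which assumes that "3/67" and "15/110" denote the number of differing positions out of the number of positions compared; the function name percent_identity is hypothetical.

```python
def percent_identity(mismatches, compared):
    """Percentage of compared positions that are identical."""
    if compared <= 0 or not 0 <= mismatches <= compared:
        raise ValueError("need 0 <= mismatches <= compared and compared > 0")
    return 100.0 * (compared - mismatches) / compared


if __name__ == "__main__":
    # Chicken vs Spix's Macaw CHD-W: 3 differences in 67 positions.
    print(f"Chicken/Macaw: {percent_identity(3, 67):.1f}%")
    # Chicken CHD-W vs Mouse CHD1: 15 differences in 110 positions.
    print(f"Chicken/Mouse: {percent_identity(15, 110):.1f}%")
```

Rounded to the nearest whole number these values agree with the 96% and 86% given in the text.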

The only exception to the non-functional avian W-linked
sequences is DZWM1, which is a putative gene cloned from a turkey cDNA
library. Like CHD-W, this gene appears to be sex linked in many bird
species. Unfortunately, so little information has been published in the
papers that describe DZWM1 that the nature of the gene remains unknown
(Dvorak et al. 1992, Halverson 1990, Halverson & Dvorak 1993).

For sexing large numbers of birds Southern blot analysis is
slow and expensive. The technique that we have used is based on a PCR
using P2 and P3 primers followed by a HaeIII digestion of the
amplified product. The digestion distinguishes between the CHD-W
product, which is uncut, and the CHD-1A product, which is cut. The
technique will work to sex a range of bird species that span the class
Aves. The primers target a highly conserved region so are likely to be
'universal' to the birds, but there is no real reason for the discriminatory
HaeIII site, which cuts CHD-1A but not CHD-W, to be conserved. If HaeIII
does fail to be discriminatory, other cutting sites have been suggested, or
the CHD-W and CHD-1A PCR products can easily be sequenced to look for
an alternative. Alternatively, the different nucleotide sequences of the
amplified CHD-W and CHD-1A suggest that the two PCR products would be
separable on an agarose gel of around 3% or a non-denaturing acrylamide
gel. This would remove the need for a cutting enzyme and may well make
the sexing technique easier to use.
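
The logic of reading the digest can be sketched in code. The example below is illustrative only: the two 110bp amplicon sequences are invented placeholders (not the real CHD-1A or CHD-W sequences), chosen so that one carries a single HaeIII site giving the 65bp and 45bp fragments reported in Fig. 17 while the other carries none. The function names and the 3bp size tolerance are assumptions. HaeIII recognises GGCC and cuts between GG and CC.

```python
HAEIII_SITE = "GGCC"  # HaeIII cuts GG^CC, leaving blunt ends


def haeiii_fragments(seq):
    """Return the fragment lengths produced by complete HaeIII digestion."""
    seq = seq.upper()
    cut_points = []
    start = seq.find(HAEIII_SITE)
    while start != -1:
        cut_points.append(start + 2)  # cut falls after the second G
        start = seq.find(HAEIII_SITE, start + 1)
    bounds = [0] + cut_points + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def call_sex(band_sizes, uncut_size, tolerance=3):
    """Interpret gel bands: cut bands only = male, cut plus uncut = female."""
    has_uncut = any(abs(band - uncut_size) <= tolerance for band in band_sizes)
    has_cut = any(band < uncut_size - tolerance for band in band_sizes)
    if has_cut and has_uncut:
        return "female"
    if has_cut:
        return "male"
    return "undetermined"  # e.g. the HaeIII site is absent in this species


# Hypothetical placeholder amplicons, 110bp each.
CHD1A_AMPLICON = "A" * 63 + "GGCC" + "T" * 43  # one HaeIII site
CHDW_AMPLICON = "A" * 63 + "GACC" + "T" * 43   # no HaeIII site

if __name__ == "__main__":
    male_bands = haeiii_fragments(CHD1A_AMPLICON)
    female_bands = male_bands + haeiii_fragments(CHDW_AMPLICON)
    print(male_bands, call_sex(male_bands, 110))
    print(female_bands, call_sex(female_bands, 110))
```

Returning "undetermined" when no cut product is seen reflects the caution in the text that the HaeIII site need not be conserved in every species.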
The CHD based test appears to be fairly robust, but the
chance of a peculiar mutation in some bird species cannot be excluded.
Cases concerning SRY/Sox3 genes on the sex chromosomes in mammals
support this claim. In two species of the mole vole Ellobius, males have
neither a Y chromosome nor an SRY gene (Just et al. 1995). In a second
case, four species of the South American field mouse Akodon have 15-40%
of fertile females with XY chromosomes and an SRY gene (Bianchi et al.
1993). These examples are particularly peculiar in that the SRY gene is
accepted as the gene that determines sex throughout the mammals. In
neither case would the detection of SRY reliably inform you of the
animal's sex.

These examples from the Muridae may never occur with the
CHD genes of birds. However, they do suggest that sex identification by
the amplification of CHD-W and CHD-1A should always be validated by a
test on several individuals in a new species before it is applied. Despite
this warning, whether through the test described here or by other means
using CHD-W or CHD-1A, these genes provide a method to sex most bird
species.



Table 1. Sex of domestic fowl with normal and abnormal chromosome
complements (from McCarrey & Abbott (1979) and Crew (1954)).

Chromosome complement    Phenotype
AAZZ                     Male
AAZW                     Female
AAZZW                    Male?
AAZZZ                    Male
AAAZZZ                   Male
AAAZZW                   Intersex/male


Figure Legends 
Figure 1. The DNA sequence of the pGT-W insert.

Figure 2. A map of the 9.6kb insert of the λFixII clone isolated
from the great tit using pGT-W. pGT1.7 and pGT8 are the two EcoRI
subclones into which the fragment was divided. The broken line
corresponds to the region with absolute sequence identity to the pGT-W
insert. The position of the region with identity to the mouse CHD-1 gene is
indicated.

Figure 3. An alignment of a 123bp fragment of the great tit
(GT) CHD-W gene in pGT8 with the autosomal/Z located chicken (C)
CHD-1A, the chicken CHD-W gene and bases 3855-3977 of the mouse (M)
CHD-1 gene. An alignment of the deduced amino acid sequence is also
given.

Figure 4. The section of pGT8 that hybridized to a female
specific fragment of 3.1kb in the chicken. This probe was also used to
screen the chicken cDNA library. The hatched line represents the female
specific great tit motif shown in Fig. 3.

Figure 5. The complete nucleotide sequence of CHD-1A as
defined by the clones Z4, Z6 and Z11. Two asterisks underlie the position
where part of the sequence illustrated in Fig. 7 is spliced onto the 5' or 3'
ends of a proportion of the clones isolated. The ATG at nucleotide 228 is
the start codon whilst TAA at 5388 is the stop codon.

Figure 6. The strategies used to determine the nucleotide
sequences of CHD-1A and CHD-W given in Fig. 5 and Fig. 8. The top line
represents the mouse clone given by Delmas et al. (1993). The three 'Z'
clones of CHD-1A and the 'CC4' and 'CC14' clones of CHD-W were
derived from either a stage 10-12 or a 10 day chick cDNA library
respectively. Arrows indicate the direction of sequence determination.
Note Z6 actually ran from -227 to 69. These nucleotides were determined
and are found in Fig. 5.

Figure 7. A composite nucleotide sequence and putative
translation of the motif that is found spliced to a proportion of the 5' or 3'
termini of CHD-1A clones or the 3' end of the CHD-W clone CC14. The
portion attached to the CC14 sequence is incomplete.

Figure 8. A partial nucleotide sequence of CHD-W as 
defined by the clones CC4 and CC14. 

Figure 9. An alignment of the deduced amino acid
sequences of the chicken (C) CHD-1A and CHD-W with the mouse (M)
CHD-1. With gaps introduced to maximize alignment they show a
sequence identity of 91.6% over 1365 residues. The $ sign indicates start
and stop codons. Boxed sections are the chromodomain (C), Helicase (H),
and the region containing the DNA binding domain (B) identified by Delmas
et al. (1993). A trimer repeat of a basic HSDHR motif is underlined. A *
denotes residue identity and a . similarity.

Figure 10. An alignment of the deduced amino acid
sequences of CHD-1A and CHD-1Y, a putative yeast homologue of the
chicken gene identified through a search of the EMBL database. With
gaps introduced to maximize alignment they show a sequence identity of
37.7% over 1538 residues. A | indicates identity and a : conservative
substitution.

Figure 11. Comparison of 9 chromodomain sequences.
Vertical lines indicate the extent of the chromodomain as defined by Paro
& Hogness (1991). The top three sequences represent the CHD class of
chromodomain to add to the HP1 class and Pc class
as defined by Pearce et al. (1992). The first letter of each annotation
indicates the animal of origin: C, chicken; M, mouse; D, Drosophila; H,
human; Y, S. cerevisiae, whilst the remainder identifies the gene type. The
yeast gene is a possible CHD homologue identified by its close identity to
the vertebrate forms. * indicates sequence identity within the groups and
identity between all nine sequences. * indicate amino acid residues inside
and downstream of the motif that are characteristic of the CHD class
chromobox.

Figure 12. Genomic Southern blots of DNA from male and
female chickens and lesser black-backed gulls digested with PvuII and
probed with a 433bp HindIII/Sac fragment of pGT8 (Fig. 4) at moderate
stringency. Hybridization with female linked fragments and fragments
common to both sexes can be observed in both species. Numbers give
approximate sizes in kilobases.

Figure 13. Genomic Southern blots of DNA from male (M)
and female (F) mice, ostrich, chicken, bee-eater and hyacinth macaw
probed with the 1335bp insert of CC4 at moderate stringency.
Hybridization with mouse and ostrich is with fragments shared by both
sexes whilst the non-ratite birds show additional hybridization to female
specific fragments. In these latter species, the signal from female linked
hybrids is stronger than with autosomal/Z linked fragments indicating that
the probe is derived from the W chromosome. Numbers give approximate
sizes in kilobases.

Figure 14. The nucleotide sequence of part of a single CHD1
gene isolated from the Mouse and the homologous genes from the
Chicken, Hyacinth (A12.3 subclone) and Spix's Macaw all arranged as
putative codons. Dashes denote nucleotides shared with the Mouse
sequence. The primers designed are shown on the diagram. An arrow
head indicates a non-synonymous mutation in the Spix CHD-W. The DdeI
(CTNAG) and HaeIII (GGCC) sites are underlined.

Figure 15. The technique of PCR sex identification in the
Spix's Macaw. Semi-nested PCR amplification is carried out on both sexes
with the primers P2/P3 then P1/P2 to provide products of identical sizes in
both sexes. The products are then cut with the restriction enzyme DdeI,
which cuts only the CHD-W product from the female. The cut products are
run on a visigel and the difference between the sexes can be visually
detected. See Fig 17 for an example.

Figure 16. DdeI restricted PCR products demonstrating that the
remaining wild Spix's Macaw is male. Lane 1, the wild bird; 2, negative
extraction control; 3, known male; 4, known female. The larger fragment is
of 104 bp and the female W-chromosome specific fragment of 73 bp.

Figure 17. Sex identification in the Marsh Harrier (MH),
Chicken (C) and African Marsh Warbler (AMW) carried out using an
identical reaction. For each species genomic DNA of male and female
birds was subjected to PCR with primers P2 and P3 and the product of
110bp is visible in lanes 1 and 2. In lane 3 the entire male PCR product,
amplified from CHD-1A, has been cut into two parts with HaeIII (65bp,
45bp). In females, lane 4, this HaeIII cut product is also present but the
CHD-W product remains uncut so the sex can be identified. The 'Kb' lane
contains a '1 Kb DNA ladder' (BRL), the 'H' lane is a PCR reaction with P2
and P3 carried out on human genomic DNA and the '-ve' lane contains a
negative PCR reaction.

CLAIMS

1. The nucleotide sequences of CHD-1A and CHD-W as shown
in Fig. 5, Fig. 7 and Fig. 8.

2. A clone or subclone of CHD-1A and CHD-W as defined in
claim 1.

3. A fragment of CHD-1A and CHD-W capable of giving a W
specific signal on hybridization to a non-ratite bird.

4. A fragment of CHD-1A and CHD-W obtainable by restriction
endonuclease digestion thereof and being capable of giving a W specific 
signal on hybridization to genomic DNA of a non-ratite bird. 

5. A clone or subclone of a fragment according to either of
claims 3 and 4.

6. A nucleic acid or fragment or oligonucleotide having
substantially the sequence of CHD-1A and CHD-W as set out in Fig. 5, Fig.
7 and Fig. 8.

7. A clone or a subclone of a nucleic acid or fragment or
oligonucleotide according to claim 6.

8. A nucleic acid or fragment or oligonucleotide having 
substantially the same sequence of the chicken or great tit CHD-gene as 
set out in Figs 1, 3, 5, 7 or 8.

9. A nucleic acid or fragment or oligonucleotide being capable of
giving a W chromosome specific signal on hybridization to the genomic
DNA of a non-ratite bird.

10. A nucleic acid or fragment or oligonucleotide according to
claim 4 or claim 9 capable of giving a W chromosome specific signal on
hybridization to the genomic DNA of a chicken, turkey, duck, parrot.

11. A nucleic acid or fragment or oligonucleotide according to any
one of claims 4, 9 and 10 capable of giving a W chromosome specific signal
on hybridization to the genomic DNA of a non-ratite bird under conditions
of high stringency.

12. A nucleic acid or fragment or oligonucleotide according to any
one of claims 4, 9 and 10 capable of giving a W chromosome specific signal
on hybridization to the genomic DNA of a non-ratite bird under conditions
of low stringency.

13. A nucleic acid or fragment or oligonucleotide according to any
one of the claims 9 to 13 containing substantially the sequence of the
chicken CHD-gene as set out in Fig. 5, Fig. 7 and Fig. 8.

14. A nucleic acid or fragment or oligonucleotide encoding a
CHD-protein, fragment thereof or polypeptide containing a CHD-gene or
part thereof or encoding a CHD-mimetope protein or fragment thereof or
CHD-mimetope polypeptide.

15. A process for ascertaining the sex of an embryo, foetus, cell,
tissue or organism comprising hybridizing a nucleic acid or fragment or
oligonucleotide according to any one of claims 1 to 14 with DNA or RNA of
the embryo, foetus, cell, tissue or organism or with cDNA reverse
transcribed from RNA of the embryo, foetus, cell, tissue or organism or
with cDNA or DNA amplified by cloning or polymerase chain reaction from
DNA or RNA of the embryo, foetus, cell, tissue or organism.

16. Use of a nucleic acid or fragment or oligonucleotide of any
one of claims 1 to 14 in ascertaining the sex of an embryo, foetus, cell,
tissue or organism.

17. A process for controlling the sex of the progeny of an
organism comprising inserting a nucleic acid or fragment or oligonucleotide
of any one of claims 1-14 into the genome of the organism or progenitor
thereof.

18. Use of a nucleic acid or fragment or oligonucleotide of any
one of claims 1 to 14 in controlling the sex of the progeny of an organism.

19. A CHD-protein, fragment thereof or polypeptide containing a
CHD-gene or part thereof or a CHD-mimetope protein, fragment thereof or
a CHD-mimetope polypeptide.

20. A protein or fragment thereof or polypeptide containing a
CHD-chromobox including at least one of the characteristic amino acid
residues at position 11, 12, 20, 27 or 31 inside the chromobox or 3, 6, 8,
12-15 or 16 directly downstream of the chromobox when aligned to best
effect and as set out in Fig. 11.

21. A protein or fragment thereof or a polypeptide encoded by a
nucleic acid or fragment or oligonucleotide according to claims 1-14 and
containing a CHD-chromobox.

22. A process for controlling the sex of the progeny of an
organism comprising supplying exogenously to a cell of the organism or a
progenitor of the organism a protein or fragment thereof or a polypeptide
according to any one of claims 19-21.

23. A process according to claim 22 wherein the protein or
fragment thereof or polypeptide is supplied and activates a CHD-1A or
CHD-W target gene.

24. An antibody or fragment thereof against a protein or fragment 
thereof or polypeptide according to any one of claims 19-21.

25. An antibody producing cell capable of expressing an antibody 
or fragment thereof according to claim 24. 

26. Use of a protein or fragment thereof or polypeptide according
to any one of claims 19-21 or antibody or fragment thereof or cell
according to claims 24 or 25 in ascertaining the sex of an embryo, cell,
tissue or organism.

27. A transgenic or chimeric animal having a heterologous
nucleic acid or fragment or oligonucleotide according to any one of claims
1 to 14 in the genome of at least the germ cells of the animal.

28. Gametes of an animal according to claim 27. 

29. Progeny of an animal according to claim 27. 

30. Progeny according to claim 29 which are transgenic or 
chimeric and have a heterologous nucleic acid or fragment according to 
any one of claims 1-14 in the genome of at least the germ cells of the 
progeny. 

31. A method of controlling the population of a species of bird
which comprises introducing an individual member of the species into the
population, said individual having a copy or copies of a nucleic acid
fragment or oligonucleotide according to any one of claims 1 to 14
integrated on a chromosome (carrier chromosome) be it sex linked or
autosomal whereby when the male breeds with other individuals of the
population the progeny are substantially of one sex or are sexually
dysfunctional intersexes.

32. A method according to claim 31 where the nucleic acid
integrated into the carrier chromosome is homologous to the native
CHD-1A or CHD-W gene of the bird.

Figure 1.



CCCGGTCGGAGGTTTCAAGGAATGACTAGATGTGGCACTTAGTGCCATGGTCTAGTTGAC 
AAGGTGATGGTTGGTCAAAAGTTGGACTCGATGATCTCAGAGTTTTTTTCCAGCCTTAAT 
AATTCTATGAATTCTGTAATTTTATTCTTGATCTTTTTGAGCGAAGTTTGTTTGGGGATT 
TTAGTTTGGTTTCCCTGTCACTGTTTTCTTTCCTTGAAACTGACTTTCATTTGCAACATG 
AGAATTGCTGTATTTGTCAGGTTACAAGTAGTGCAATGGCTGCTTAGAAGTAGTGAGAAA 
CATTTAGGGAAATACTGGAGTGAAGCAAACACAGTGGTACTGCCAAACTGTAGCTTTGGG 
ATTTGAGGAGCCACAGAGTTGTATATAAATTTGTTTAATGATATCCTGCCCCTGCCTTCC 
ATTAATTGCTTGTTTTATGAAACCACTCTTTTTTTTTTTTTTTTTTTTTTGGCTTCTTCA 
TATCCTGTGGTAATGAGTTAATGCATTTAGAAGCACATGGCAGAACTAGGAGATCTGTGG 
ATGACAGTGGTACAGGAGCTCTGAATTTTTTAGATAAACTATGAGAGTGGAAACAGAAAT 
CTGAGGCTAGTTTCTTGAGCTGACTGTAAATTTTGTGAGAATATTTTCAAGACTACATTA 
GTTGTGTGTTTGAGGAAAAATAAAATGTTTAAGTTGTCCATTCCTTGAAACCTCCCGACC 
GGG

Figure 2.



[Map of the 9.6kb insert showing the pGT1.7 and pGT8 subclones and the
123bp region with identity to CHD-1]



Figure 3. 



M   CHD-1   ATTCTTCCAG ATGATCCTGA TAAAAAACCA CAAGCAAAAC AGTTACAGAC
C   CHD-1A  ATTTTACCTG ATGATCCAGA CAAGAAACCC CAGGCAAAGC AGCTACAGAC
C   CHD-W   ATTTTACCTG ATGATCCAGA TAAGAAACCC CAGGCTAAGC AGTTACAGAC
GT  CHD-W   ATTTTACCTG ATGACCCAGA TAAGAAACCA CAGGCAAAGC AGTTGCAGAC

M   CHD-1   CCGTGCAGAC TACCTCATCA AACTACTTAG CAGAGATCTT GCAAAAAGAG
C   CHD-1A  CCGTGCAGAC TACCTCATTA AATTACTGAA TAAAGACCTT GCAAGAAAGG
C   CHD-W   CCGTGCAGAT TACCTCATTA AATTACTGAA TAAAGACCTT GCAAGAAAGG
GT  CHD-W   CCGTGCAGAT TACCTCATTA AATTACTGAA TAAAGACCTT GCAAGAAAAG

M   CHD-1   AGGCTCAGAG ACTTTGTGGT GCG
C   CHD-1A  AAGCACAAAG GCTTGCTGGT GCA
C   CHD-W   AAGCACAGAG ACTTGCTGGT GCA
GT  CHD-W   AAGTGCAAAG ACTTACTGGT GCA



M   CHD-1   ILPDDPDKKPQAKQLQTRADYLIKLLSRDLAKREAQRLCGA
C   CHD-1A  ILPDDPDKKPQAKQLQTRADYLIKLLNKDLARKEAQRLAGA
C   CHD-W   ILPDDPDKKPQAKQLQTRADYLIKLLNKDLARKEAQRLAGA
GT  CHD-W   ILPDDPDKKPQAKQLQTRADYLIKLLNKDLARKEVQRLTGA
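
The correspondence between the nucleotide rows and the peptide rows of Figure 3 can be checked by translation. The following is a minimal sketch in Python, assuming the standard genetic code and a reading frame that starts at the first base of the region. The names `CODON_TABLE`, `translate` and `CHICKEN_CHD_1A` are illustrative only and do not appear in the specification. The input is the chicken (C) CHD-1A row of Figure 3.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order
# (assumption: no alternative codon table is needed for this region).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; whitespace is ignored."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # drop any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Chicken CHD-1A row of Figure 3 (123 bases, in blocks of ten).
CHICKEN_CHD_1A = (
    "ATTTTACCTG ATGATCCAGA CAAGAAACCC CAGGCAAAGC AGCTACAGAC "
    "CCGTGCAGAC TACCTCATTA AATTACTGAA TAAAGACCTT GCAAGAAAGG "
    "AAGCACAAAG GCTTGCTGGT GCA"
)

peptide = translate(CHICKEN_CHD_1A)
print(peptide)  # ILPDDPDKKPQAKQLQTRADYLIKLLNKDLARKEAQRLAGA
```

Under these assumptions the 123 bases give 41 residues, which agrees with the C CHD-1A peptide row of the figure.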

Figure 4.

1801 AACTGTATGG CCCTTTTCTT CTGCGCGTGC SctcJISJ SJSJJS^nS 


??ni ^^oSI^^^ CyuVGCTCATC GTTTAAAAAA TGATGACTCT CTTCTctSI 
^^nlll^Z AGACTTTAAG TCCAACCATC GACTTCtSt TACtS^?? 
^in^^^^ ATTCCCTCAA AGAGCTGTGG TCTTTGTTGC ATTTCaJJJS 

ill] ^^^^^'^ TTTTCCTCCT GGGAAGATTT TGAAGAGGAG StGGCAaJg 

iio? ^^f^S*^^ tggttatgca agtcttcaca aagagcttga S?S5J?Ja 
2301 ctaagaagag ttaaaaaaga tgtagaaaag tctttacctg ctaaggSpa 


tcagatggtg aggatgctgg acatcctagc agaatatctg aagtatcS? 

2701 AGTTTCCCTT CCAGAGACTT GATGGATCAA TAAAAGTCGA JJJS^J 

III] ^^^^'"'"'^ ATCATTTCAA TGCAGAAgSJ TCAG^gSS JJJot????? 

It^} ACTGTCTACA AGAGCTGGAG GATTAGGTAT TAACTTGGCA TCTGCtScI 

2851 CTGTAGTTAT TTTTGATTCT GACTGGAATC CACAGAATGA TC???I^^^ 
2!?? CTCATAGAAT TGGACAGAaS A^^JS SaJS^JS 

2951 GCTAGTCACA AAAGGATCAG TAGAAGAAGA tattcttgaa aSJJJS 

3001 AGAAGATGGT GCTAGACCAT TTAGTAATTC AGAGAATGGA CACGACAGGA 

3051 AAAACTGTTC TGCATACAGG TTCAACTCCA TCAAGCTCTA CACCTTTTAA 

3101 TAAAGAAGAG TTATCAGCTA TTTTGAAGTT TGGTGCTGAG GAACTCTTTA 

3151 AAGAACCTGA AGGAGAAGAA CAGGAGCCCC AGGAAATGGA TATAGATGAA 

3201 ATCTTGAAGA GAGCTGAAAC TCGGGAAAAT GAGCCAGGTC CATTGACTGT 

3251 AGGGGATGAG TTGCTTTCAC AGTTCAAGGT GGCGAACTTT TCCAATATGG 

3301 ATGAAGATGA TATTGAGTTG GAACCAGAAA GAAATTCAAG AAATTGGGAA 

3351 GAAATCATCC CAGAATCCCA ACGGAGAAGG ATAGAGGAGG AGGAAAGACA 

3401 AAAAGAACTT GAAGAAATAT ACATGCTCCC GAGGATGAGA AACTGTGCAA 

3451 AACAGATCAG CTTTAATGGG AGTGAAGGAA GACGCAGTAG GAGCAGAAGA 

3501 TATTCTGGAT CTGATAGTGA CTCCATCACA GAAAGAAAAC GGCCAAAAAA 

3551 GCGTGGAAGA CCTCGAACCA TTCCTCGAGA AAATATTAAA GGATTTAGTG 

3601 ATGCAGAGAT CAGGCGGTTT ATCAAGAGTT ACAAGAAATT TGGTGGCCCT 

3651 CTGGAAAGGT TAGATGCTGT AGCTAGAGAT GCTGAACTGG TTGATAAATC 

3701 TGAGACAGAC CTTAGACGTT TGGGTGAACT TGTACATAAT GGATGCATTA 

3751 AGGCTTTAAA GGACAATTCA TCTGGACAAG AAAGAGCAGG AGGTAGACTT 

3801 GGGAAAGTTA AAGGCCCAAC GTTTCGAATC TCAGGAGTGC AGGTGAATGC 

3851 AAAACTAGTC ATCTCTCACG AAGAAGAGCT GGCACCACTG CACAAATCCA 

3901 TTCCTTCAGA TCCAGAAGAA AGGAAAAGAT ATGTCATCCC ATGCCACACC 

3951 AAGGCTGCTC ACTTCGATAT AGATTGGGGT AAAGAAGATG ATTCCAATCT 

4001 GTTAGTAGGC ATCTATGAAT ATGGCTATGG CAGCTGGGAA ATGATAAAAA 

4051 TGGATCCAGA TCTCAGCTTA ACACAGAAGA TTTTACCTGA TGATCCAGAC 

4101 AAGAAACCCC AGGCAAAGCA GCTACAGACC CGTGCAGACT ACCTCATTAA 

4151 ATTACTGAAT AAAGACCTTG CAAGAAAGGA AGCACAAAGG CTTGCTGGTG 

4201 CAGGCAATTC CAAGAGAAGG AAGACAAGAA ATAAGAAGAA TAAGATGAAG 

4251 GCTTCAAAAA TAAAAGAAGA AATAAAGAGT GATTCTTCAC CACAACCCTC 

4301 AGAAAAATCT GATGAAGATG ATGAGGAGGA GGATAACAAG GTAAATGAAA 

4351 TGAAATCTGA AAATAAAGAA AAATCTAAAA AAATTCCATT GCTGGATACT 

4401 CCAGTTCATA TTACTGCAAC CAGTGAACCA GTTCCTATCT CAGAAGAATC 

4451 TGAAGAACTC CATCAGAAGA CATTTAGTGT GTGCAAAGAA AGAATGAGGC 

4501 CTGTCAAAGC AGCACTGAAA CAGCTGGATA GACCAGAGAA GGGCCTTTCT 

4551 GAAAGGGAGC AGCTGGAACA TACTAGGCAG TGTCTAATGA AAATTGGGGA 

4601 TCACATTACA GAATGCCTGA AGGAGTACAC AAATCCCGAG CAAATAAAAC 

4651 AGTGGAGGAA AAATTTGTGG ATTTTTGTGT CCAAGTTTAC AGAATTTGAT 

4701 GCCAGAAAGC TGCACAAACT CTACAAACAT GCAATCAAAA AGCGCCAAGA 

4751 GTCTCAGCAA CACAATGACC AAAACATTAG CAGCAATGTG AATACACATG 

4801 TAATCAGAAA TCCAGATGTG GAAAGACTGA AGGAGACTAC AAACCATGAT 

4851 GATAGTAGCA GGGACAGTTA TTCTTCTGAT AGACATTTAT CACAATACCA 

4901 TGATCATCAC AAAGACAGGC ATCAGGGAGA TGCTTACAAG AAAAGTGACT 

4951 CCAGGAAAAG GCCATATTCA GCCTTCAGTA ATGGAAAAGA TCACAGAGAC 

5001 TGGGATCACT ACAAACAGGA CAGCAGATAC TACAGTGATA GTAAACATAG 

5051 AAAGTTAGAT GACCACAGGA GCAGAGACCA CAGGTCAAAC CTGGAAGGAA 

5101 ACTTAAAAGA CAGCCGGGGT CATTCAGATC ACCGCTCCCA TTCAGACCAC 

5151 AGGATACACT CAGATCACCG TTCCACTTCA GAATACAGCC ATCATAAATC 

5201 TTCGAGAGAT TATAGATACC ACTCAGACTG GCAAATGGAC CACAGAGCTT 

5251 CTGGTAGTGG CCCGAGGTCA CCACTAGATC AGAGGTCTCC TTATGGTTCA 

5301 AGATCTCCCC TAGGACACAG ATCTCCATTT GAACACTCAT CAGATCACAA 

5351 AAGTACACCT GAACATACAT GGAGTAGCCG GAAGACATAA GAAAGACTGA 

5401 CATTTTCTGG ACCTTCTTTT TAGCCATATA CAGTAAACTA ACACAGTAAT 

5451 TGCCTTACAT GACTTGAAAG ATATGGACTG GATATTCTAT CAGTAGCAGT 

5501 ATTGTTACTT CTTTCCAGGA TGCAAGGTCT ATTATCCCAA CAGAAGAAAA 

5551 ATATTTTTGT ATTTAAAGTT TATGCTGCAC TGTGCTGCAA ATGTTGTGGC 

5601 ACTTTTTTTT TAAGAAATGG AAGATGTTTA CTTTTACAGG GACCTCAACA 

5651 CTGCCCCTTT CAGACTGGAT CTTACTATAA AACTCTTCAT GTCAAAGTGG 

5701 TTCTAGGCTG AACACAGATT AAATTATGTT TGTAAATGAA CACTTAAACA 

5751 CTGACCTGTG CTTATGTTTC AGGAAAGAAT GGGGGATTTA TTTTGTTTTA 

5801 TTTCTTGGTA GAGAACTCTC AAGGACTTTG TTCACTTTCC AAAGCTACTT 

5851 GTTTACATTG TACACTGCGA CCACCTTGCC GCTTTTCATC ACAAGCTTGA 

5901 ATATTTAAAT TCTGTACCTA CAGTTGTAAA ATAGCCAGGA TTTCTCCTGT 

5951 TTGTGATCAG TTATAATGCC TTTTTATGAA ACAAACAAAC AAACAAAAAA 

6001 CAATTAAAAA AAAAAACACA ACAAAACCAA CAAATGGCTG TAAATTATTG 

6051 TAAATTAATT AAATGAGCTT TTTTCCGTCA GGCTTTTTTT GGCTGTTCCT 

6101 TTCCCCAACA ACTCAGGCCT TCTTTTCACA AAGTCAGTAT ACTTACATGT 

6151 TTTAATAAAA TATCTCGATG GAATCAGAAT GTAAAAATGG GGAAGGGAAT 

6201 ATTTTATTCC ATTTAGTGCT CCTTTTTTAT TGGATACTTT TACATACCTG 

6251 TTTTTGGTTG TTTTATTTTA TTTTTTTTTT CTATTAAACT GTCAGTGTTG 

6301 TGATTGTTGT AATGAACAGT GAGAATATCC CACTCTAAAC TGTGCCCTGG 

6351 AAAGCTTTTC AGGTGCATTG GTTTAAAAGA AGGAAGTGTT CTATAGGTGA 

6401 ACACTTCAAA ACCCAGATCA GCCAAGATTC ATTGTAAATC CATTTGTTTT 

6451 CCCTCTTTAA CATGGGCAAT AATGTCAAAT GTGCTATGCA GCAGTTAATA 

6501 TTTTAGAAGA TTTGAATGAC TTTATTAACA GAATTGTTAC AATGCACACT 

6551 GATTGTACAT AGATAACTTC TATCTGACAA ATTAAATTAA CTAAAACCAA 

6601 AAAAAACC 






Figure 6. 

Figure 7.



DEIVSVKHLHKKIKTE 
CHD-1A 1 GATGAGATTGTTTCAGTGAAACATCTACATAAAAAAATAAAAAC AGAAA 
CHD-W 1 GATGGGATTGTTTCAGTGAAACATCCACATAAAAAAATAAAAGC AGAAA 

DGIVSVKHPHKKIKAE 

KENEEKPEPDIGIKKEA 
CHD-1A 51 AAAGAAAATGAAGAAAAGCCTGAGCCAGATATTGGTATAAAGAAGGAAGCT 
CHD-W 51 AAAGAAAATGAAGAAAAAGATGAGCCAGAGATTGGTAT AAAGAAGGAAGCT 

KEKEEKDEPEIGIKKEA 

EEKRETKEKENKR ELKR 
CHD-1A 101 GAAGAAAAAAGAGAGACAAAAGAGAAGGAAAATAAAAGGGAATTGAAAAGG 
CHD-W 101 GGAGAAAAAAGAGAGACAAAAGAAAAGGAAAATAAGA 

GEKRETKEKENK 

EKKEKEDKKELKEKDNK 
CHD-1A 151 GAGAAAAAAGAAAAAGAGGATAAGAAAGAATTAAAAGAAAAAGATAATAAA 

EKRE NKVKESTQKEKE V 
CHD-1A 201 GAAAAGAGAGAAAACAAAGTAAAAGAATCCACACAGAAAGAAAAAGAAGTG 

KEEK 
CHD-1A 251 AAGGAAGAGAAG 

ATTTATCGGC TAGTCACAAA AGGATCAGTA GAAGAAOATA 
10 20 30 40 

ASTGATTCAG AGAATOGACA CCACAOGGAA AACTGXACTA 
90 100 110 120 

AOGAAGAGTT ATCAOCAATT TTCAA UTITO GTOCTGAOGA 
170 180 190 200 

GAGATOGATA TAGATGAAAT CCTGAAGAOG NCTGAAACIC 
250 260 270 280 

ACTTTCACAG TTCAAOGTAG CTAACTmC CAATATGGAT 
330 340 350 360 

ACTGQGAAGA AATCATTCCA GAAOrrCAGT GGCGACGAAT 
"-^ 410 420 430 440 

ATGCTTCCAA GAATGAGAAA CTGTGCAAAA CAGATCAC3Ct 
490 500 510 520 

TTCTOGATCT GATACTGATT CCATC1CAGA AAGAAAACGA 
570 580 590 600 

ACATTAAAGG ATTTAGTGAT GCAGAGATTA GACGATTTAT 
650 660 670 680 

GA30CTATAG CTAGAGATGC TGAOCTAGTT GATAAATCTG 
730 740 750 760 

AIXSCATTAAG GCTTTAAATC ATAATGACTT TGGTCAAGGA 
810 820 830 840 

TCCGAA!CAGC AGGAGTQCAG GTCAATGCAA AOCTAGTCAT 
890 900 910 920 

CCTTCAGATC CAGAAOAAAG GAAAAGATAT GTCAUCCCAT 
970 980 990 1000 

AGAAGATGAT TCCAA3CTOT TAATAGGCAT CTATGAATAT 
1050 1060 1070 1080 

TCAGTTTGAC ACAGAAGATT TTACCTGATG ATCCAGATAA 
1130 1140 1150 1160 

CTCAITAAAT TACTGAATAA AGACCTTOCA AGAAAGGAAG 
1210 1220 1230 1240 

AAOAAGAAGT AAGAAGAATA AAGCAACAAA GGCTC9C 
1290 1300 1310 

Figure 9.



C CED-IA nARRYIX3roJIiGHL*RIASQOTyFENIRSTRKGILDLHI*VT*VDriJT^ 

C CaO'lA SVRNSSGBSSRSDDDSJtfSSASGSaSGSSSGSSSDGSSSQSGSSDSBSCSESGSQSgSBSD 

C CSO-IA TSraiaCQV0AKPPKADGSEFWKSSPSIIAVQRSAVLP3tQQQQ---^!UU^ 

M CHD^l TSRZNK-VOAKPPKVDGAErWKSSPSIIAVQRSAMUUCOFQOAQQQRPi^KSGSEED^ 



C CBD'IA 
M CBD~1 



SSE[>SADDSSSETKKKKEK£eD«fQKSGSGSVSGTGSDSE&ABDGaCSSCEESESD¥EP 

SSBDS-D0SSSGAiauaCHMDBDWQKSGSGSFSQliGSDSESEEEIU2KSSCDC^ 
***** ***** .*_***,*♦*♦♦*•»*♦♦ • •*«*** *^ 



C CSD-IA KVK5RKFPSiaKPKSGKKSlX3QK]a^LZ>SSBB£ED0DEDYDKRGSR^ 

M CHD^l KVRSRI^NRSKSroiaCXZI^KKRQZDSSEDBDDeDroNDKR^^ 

C CHD^IA TKTDSDDIJ£VCGEDVPgTEBDEFETIBXnfZ>SRIGRKa^^ 

M CBD~1 MKTDSDDIXBVCCTDVPQPBDEEPETIERVMIXRVGIWGATCATTTI 



C CBD-XA 
M CBD~1 



HUMAN 

C CBD'IA 

M CBD-l 



KTXEI [ZIQYLIKWKGWSHIHinVBTEETLKQQl^ /RGKiaa^DMYKKKDQETKRWLKKAS 
EKSKEI SEXQYLIKHKGHSHIHNTWETEETLKOQK n^GHtQODHYKXKDQBTKRHLKKAS 
BRNKEP 3DIQYLIKWKGWSHIHNTHBTEBTLKQQN mGMKKLDNYKKXDQSTKRWLKKAS 



PEEJVBYm:QC»LTDDLHKQTOIVBRTNXSPOSKSAAS» 

PEDVEyYNCQQBLTDOLHKQYQIVBRI lABSNQKSAAGTPDYYCiCMQGLPYSECSWEDGA 
PEDVZYYNCQQBLITOrfKQYQIVERI lABSNQKSAAGLPDYYCKNQ^ 



C CHD-IA 
/f CHD-i 



LIAKKFQARIDEY?SRtlOSKTTPFiax:rVTXQRPR7VAUCKQPSYXOGHE 5LBLRDYQLN 
LISKKFQTCIDEYFSRNQSKlTPFKlXlCVIJ^QRPRrVALKKQPSYIOGHB SLBLRDYOLN 



GL19WIABSWCXGNSCI]JU)EHSL(aCTIQTXSFIinrtfHEBQLYGP^ 
GLNWIABSWCKGNSCIIJU3EM6LGKTIQTISFUm«FHEHQLYGPFIXVVPI£TLT^ 

K CaO-l EIQTWASQKNAVVYLGDINSRNMIRTHBWMHPQTiaU-KKNIIXTr^ 

C CHO-IA HAFIGVDEAERLKNDDSLLYRTLIDFKSNERUilTOTPLQHSZJCBUfSIJ^ 

M CBD-l WAFIGVDEAHRLIOnXiSIiYKTLIDPTSNHRI^ITOTPLQNSLP^LWSI^ 

C CffD-LA WEDPTEEHGKGREYGYASIJnCEIJ5PriJJ«?VKKDVEKSU>AFVEQII^^ 

K CHD-J WEDrEEKHGKGREYGYASimELEPFLIJUWraUJVEJ^IJ^AKVEQIXJ^M^^ 

C CHD-IA WILTRNYKAI£KGSKGSTSGFIJ<I«MEUa^CCiraCYLIKFPDDNEFYNKOBALOm 

>f CHO-i WILTRNYKAI£KGSKGSTSGTLNIMMEU<JCCCNHCYLIK3>PDNNEFyNK 



C CffD-JA GKLILrJ5KlXIRIJOTlGNRVLIFSQMVKmr>IIJ^YIJCYTO 






C CBD-IA 
OVNAKLVXSHEEEIAPLHKSXPSaPESMRYVIPYBTMAHFDXEV^GKBDDSNI^ 

QVN»LVXAHE0ELXPX£KSXPSDPEEIUCQYrXPCBTlUU^PDXDMC3KEIX>SN^ 

QVKMOiVXSHEEELAPLHKSXPSDFESRKRYVlPCBnUU^FDXDIIGraDD^^ 

KKSVVSDAFVBXTASGBPVPXABESEELDQKTFSXCKERMRFVKAI^ 
KKIPIJiOTPVHXTATSEFVPXSEESEEIiB^TFSVCKSRKRPVKAAI.KQIi)IU>EKC^^ 



EQXJEHTRQCLXKXGOe XTECIiKE YSNPEQXKQWRKNLWIFVSKnSFDARKLHKLYKHAI 
EQIZBTRQCLIKIGOeiTECLKEYTOPEQXKQ«mQCLWXFVSKrEErD^^ 



KKR 3BSQQNS0QH-SNVArrSVIRKPDMEIU.KENaKHDDSSRDSYSSDRHLSQTHDBHKD 
KKRpBSQQHMDQNXSSHVKTHVXIUIPDVEIUnTTNBDOSSRr^YSSDRHI^ 



RHQGDSYK:KSDSRiOTYSSFSNGra>HREWDHYRQ0SRYYSDREIOTlKU5DHR^ 
RBQGDAYKKSDSI0a%PYSAPSNGKDHKDWI>BYK0D6RYYSDS-KHRKIJ3I^^ 



C CHD-IA BGGIJa>-RC HSI)HRSHSDHRMBSDHRS TPSTHIINPPRDYRYIiSDWIJ>HRAASSGPRCT 
K Cfl£>-J EGNLKDSR GHSPHRSHSDHRIHSDBRS TSEYSHHKSSRDYRYHSDWDHDHR?ISG^^ 

$ 

C CB£>-1A LDQRSPYGSRSP FEHSABHRSTPEHTtfSSRKTXQKWiSLSSGTLFXP 

M CBD-l LDQRSPYGSRSPLGHRSPFEHSSDHKSTFEBTVSSRKTXQRLTFSGPSFXPYTVNXHSNC 



C CffZ>-ZA LTXLERYGIJ)ILSVAVUXISRMQGIiSQOTWIF^mCVYAALCCKCX:GTFFIJ^^ 

C CH0-iA LQGPOflCPFQTCSYYKTI£VKVNa>GXTQIKlx:iJCMNTX 

C CHD-IA NS<X;i£Sl£KATCI«TU^PCRFSSOAXIFXFCTYSCKIARXSPVCDQIJtCIjrMKQTNK 

C CBD-IA QKTIKKKNTTKPTNQCKIJJCljOCMSFFPSGFFWIJI-SPTTOM'F 

C CTr>-lA SECITCECNII^LVLIJ'YWIIXHTCFWIJTFIFFFYXTVSWIVVWISEHIPI^^ 

C CHZJ-IA AFQVHWTKRRKCSIGEHFKTQISODSlJCIHIJSLFNMGlWVKCAMQQLIFXXIXWni 

C CflD-lA LLQCTLIVHRXLLSDKLNXUa>KCT 

Figure 16.